We develop a novel cross-lingual word representation model which injects syntactic information through dependency-based contexts into a shared cross-lingual word vector space. The model, termed CL-DepEmb, is based on the following assumptions: (1) dependency relations are largely language-independent, at least for related languages and for prominent dependency links such as direct objects, as evidenced by the Universal Dependencies project; (2) word translation equivalents take similar grammatical roles in a sentence and are therefore substitutable within their syntactic contexts. Experiments with several language pairs on word similarity and bilingual lexicon induction, two fundamental semantic tasks emphasising semantic similarity, suggest the usefulness of the proposed syntactically informed cross-lingual word vector spaces. Improvements are observed in both tasks over standard cross-lingual "offline mapping" baselines trained using the same setup and an equal level of bilingual supervision.
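To make the notion of dependency-based contexts concrete, the following is a minimal illustrative sketch (not the authors' released code) of how (word, context) training pairs can be extracted from a dependency parse in the style of Levy and Goldberg's dependency-based embeddings, using language-independent Universal Dependencies relation labels so that context types are shared across languages. The function name and the `relation#head` context encoding are assumptions made here for illustration.

```python
def dependency_contexts(tokens, heads, rels):
    """Extract dependency-based (word, context) pairs from one parsed sentence.

    tokens: list of word forms
    heads:  0-based head index per token (-1 for the root)
    rels:   Universal Dependencies relation label per token
    """
    pairs = []
    for i, (word, head, rel) in enumerate(zip(tokens, heads, rels)):
        if head < 0:
            continue  # skip the root token, which has no head
        head_word = tokens[head]
        # The child sees its head through the relation ...
        pairs.append((word, f"{rel}#{head_word}"))
        # ... and the head sees the child through the inverse relation.
        pairs.append((head_word, f"{rel}-1#{word}"))
    return pairs

# English "she reads books" and German "sie liest Bücher" share the same
# UD relation labels (nsubj, obj), so translation equivalents occur with
# the same syntactic context types across the two languages.
en_pairs = dependency_contexts(["she", "reads", "books"], [1, -1, 1],
                               ["nsubj", "root", "obj"])
de_pairs = dependency_contexts(["sie", "liest", "Bücher"], [1, -1, 1],
                               ["nsubj", "root", "obj"])
```

Under assumption (1) above, the relation labels in both languages coincide, which is what allows such contexts to define a single shared vector space.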