Cross-lingual word representations offer an elegant and language-pair-independent way to represent content across different languages. They enable us to reason about word meaning in multilingual contexts and serve as an integral source of knowledge for multilingual applications such as machine translation (Artetxe et al., 2018d; Qi et al., 2018; Lample et al., 2018b) or multilingual search and question answering (Vulić and Moens, 2015). In addition, they are a key facilitator of cross-lingual transfer and joint multilingual training, supporting NLP applications across a large spectrum of languages (Søgaard et al., 2015; Ammar et al., 2016a). While NLP is increasingly embedded in a variety of products related to, e.g., translation, conversational, or search tasks, resources such as annotated training data remain lacking or insufficient to induce satisfactory models for many resource-poor languages. There are often no trained linguistic annotators for these languages, and markets may be too small or too immature to invest in such training. This is a major challenge, but cross-lingual modelling and transfer can help by exploiting observable correlations between major languages and low-resource languages.
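The core idea can be illustrated with a minimal sketch: once words from two languages live in one shared vector space, cross-lingual reasoning reduces to nearest-neighbour search. The vectors and vocabularies below are toy stand-ins invented for illustration, not real trained embeddings.

```python
import numpy as np

# Hypothetical 4-dimensional embeddings assumed to live in one shared space.
en = {
    "dog": np.array([0.9, 0.1, 0.0, 0.2]),
    "house": np.array([0.1, 0.8, 0.3, 0.0]),
}
es = {
    "perro": np.array([0.85, 0.15, 0.05, 0.25]),
    "casa": np.array([0.05, 0.9, 0.25, 0.1]),
}

def cosine(u, v):
    # Cosine similarity between two vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def nearest_neighbour(word, src, tgt):
    """Return the target-language word whose vector is closest to src[word]."""
    return max(tgt, key=lambda w: cosine(src[word], tgt[w]))

print(nearest_neighbour("dog", en, es))  # -> perro
```

With real embeddings, the same retrieval step underlies bilingual lexicon induction and, more broadly, the cross-lingual transfer scenarios discussed above.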