Probabilistic Models of Cross-Lingual Semantic Similarity in Context Based on Latent Cross-Lingual Concepts Induced from Comparable Data

Abstract

We propose the first probabilistic approach to modeling cross-lingual semantic similarity (CLSS) in context that requires only comparable data. The approach relies on the idea of projecting words and sets of words into a shared latent semantic space spanned by language-pair-independent latent semantic concepts (e.g., cross-lingual topics obtained by a multilingual topic model). These latent cross-lingual concepts are induced from a comparable corpus without any additional lexical resources. Word meaning is represented as a probability distribution over the latent concepts, and a change in meaning is represented as a change in the distribution over these latent concepts. We present new models that modulate the isolated out-of-context word representations with contextual knowledge. Results on the task of suggesting word translations in context for three language pairs reveal the utility of the proposed contextualized models of cross-lingual semantic similarity.
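
The representation described in the abstract can be illustrated with a small Python sketch. It assumes that a multilingual topic model has already produced per-concept word probabilities P(w | z_k) for both languages. The toy numbers, the uniform concept prior, the linear interpolation weight `lam`, and the use of the Bhattacharyya coefficient as the similarity measure are all assumptions made for illustration; the abstract does not specify them, and this is not presented as the authors' exact model.

```python
import numpy as np

def concepts_given_word(p_w_given_z, p_z=None):
    """Bayes rule: P(z_k | w) is proportional to P(w | z_k) * P(z_k).

    A uniform prior over concepts is assumed when p_z is not given.
    """
    p_w_given_z = np.asarray(p_w_given_z, dtype=float)
    if p_z is None:
        p_z = np.full(p_w_given_z.shape[0], 1.0 / p_w_given_z.shape[0])
    joint = p_w_given_z * p_z
    return joint / joint.sum()

def contextualize(word_dist, context_dists, lam=0.5):
    """Shift a word's concept distribution toward its context.

    Hypothetical modulation: linear interpolation between the isolated
    word distribution and the mean distribution of the context words.
    """
    context = np.mean(context_dists, axis=0)
    mixed = (1.0 - lam) * word_dist + lam * context
    return mixed / mixed.sum()

def bhattacharyya(p, q):
    """Similarity of two distributions; equals 1.0 when they are identical."""
    return float(np.sum(np.sqrt(p * q)))

# Toy data with K = 3 latent cross-lingual concepts (values are made up).
# Concept 0 ~ finance, concept 1 ~ rivers, concept 2 ~ other.
bank = concepts_given_word([0.02, 0.02, 0.0])      # ambiguous source word
money = concepts_given_word([0.03, 0.0, 0.0])      # source context word
banco = concepts_given_word([0.03, 0.001, 0.001])  # target candidate (finance)
orilla = concepts_given_word([0.001, 0.03, 0.001]) # target candidate (river)

bank_in_context = contextualize(bank, [money], lam=0.5)

out_of_context = {"banco": bhattacharyya(bank, banco),
                  "orilla": bhattacharyya(bank, orilla)}
in_context = {"banco": bhattacharyya(bank_in_context, banco),
              "orilla": bhattacharyya(bank_in_context, orilla)}
best_in_context = max(in_context, key=in_context.get)
```

In this toy setup the isolated word scores both candidates equally, while the contextualized distribution favors the finance sense, which is the kind of change in distribution over latent concepts that the abstract describes.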

Bibliographic record

Source
Conference on Empirical Methods in Natural Language Processing | 2014 | pp. 349-362 | 14 pages
Authors
Ivan Vulic; Marie-Francine Moens
Original format
PDF
