Published in: Conference on Empirical Methods in Natural Language Processing

Probabilistic Models of Cross-Lingual Semantic Similarity in Context Based on Latent Cross-Lingual Concepts Induced from Comparable Data

Abstract

We propose the first probabilistic approach to modeling cross-lingual semantic similarity (CLSS) in context that requires only comparable data. The approach relies on the idea of projecting words and sets of words into a shared latent semantic space spanned by language-pair-independent latent semantic concepts (e.g., cross-lingual topics obtained by a multilingual topic model). These latent cross-lingual concepts are induced from a comparable corpus without any additional lexical resources. Word meaning is represented as a probability distribution over the latent concepts, and a change in meaning is represented as a change in that distribution. We present new models that modulate the isolated, out-of-context word representations with contextual knowledge. Results on the task of suggesting word translations in context for three language pairs demonstrate the utility of the proposed contextualized models of cross-lingual semantic similarity.
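The idea of representing a word as a distribution over shared latent concepts, and shifting that distribution with context, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the two toy concepts and their word probabilities are invented, the context is folded in by simple linear interpolation with a hypothetical weight `lam`, and the Bhattacharyya coefficient is used as the similarity measure. The abstract does not state which modulation or similarity functions the authors use, so this should not be read as their method.

```python
import math

# Toy per-concept word probabilities P(word | concept) for two shared concepts.
# Concept 0 ~ "finance", concept 1 ~ "river". Invented numbers, partial vocabularies.
EN = [{"bank": 0.4, "money": 0.6},
      {"bank": 0.3, "river": 0.4, "water": 0.3}]
ES = [{"banco": 0.7, "dinero": 0.3},
      {"orilla": 0.5, "rio": 0.5}]


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def concept_distribution(word, word_given_concept):
    """P(concept | word) via Bayes' rule, assuming a uniform concept prior."""
    return normalize([probs.get(word, 0.0) for probs in word_given_concept])


def context_distribution(context_words, word_given_concept):
    """Average the concept distributions of the context words (an assumption)."""
    dists = [concept_distribution(w, word_given_concept) for w in context_words]
    k = len(word_given_concept)
    return normalize([sum(d[z] for d in dists) for z in range(k)])


def modulate(word_dist, context_dist, lam):
    """Interpolate out-of-context and context distributions; lam=1 ignores context."""
    return [lam * p + (1.0 - lam) * q for p, q in zip(word_dist, context_dist)]


def bhattacharyya(p, q):
    """Bhattacharyya coefficient: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def rank_translations(word, context_words, candidates, lam=0.5):
    """Rank target-language candidates by similarity to the contextualized source word."""
    source = modulate(concept_distribution(word, EN),
                      context_distribution(context_words, EN), lam)
    scored = [(c, bhattacharyya(source, concept_distribution(c, ES)))
              for c in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(rank_translations("bank", ["river", "water"], ["banco", "orilla"]))
    print(rank_translations("bank", ["river", "water"], ["banco", "orilla"], lam=1.0))
```

In this toy setup, the context words "river" and "water" move the distribution for "bank" toward the river concept, so "orilla" is ranked above "banco"; with `lam=1.0` the context is ignored and the order flips. Both languages index the same concept list, which is what makes the two distributions directly comparable.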