International Conference on Recent Advances in Natural Language Processing
Enriching Word Sense Embeddings with Translational Context

Abstract

Vector-space models derived from corpora are an effective way to learn representations of word meaning directly from data, and they have many uses in practical applications. A number of unsupervised approaches have been proposed to learn representations of word senses automatically from corpora, but because these methods use no information beyond the words themselves, they sometimes miss distinctions that could be made if more information were available. In this paper, we present a general framework, which we call context enrichment, that incorporates external information during the training of multi-sense vector-space models. Our approach is agnostic as to which external signal is used to enrich the context, but in this work we consider translations as the source of enrichment. We evaluated models trained with the translation-enriched context on several similarity benchmarks and a word analogy test set. In all our evaluations, the enriched model soundly outperformed the purely word-based baseline.
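To make the idea of context enrichment concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how translations are attached to the context, so the sketch assumes that each token may carry one aligned foreign word (for example, from a word aligner run on a parallel corpus) and that this word is simply appended to the token's ordinary window context as a tagged feature. The function name `enriched_contexts`, the `translations` mapping, the `window` size, and the `tr:` tag are all hypothetical.

```python
from typing import Dict, List, Tuple


def enriched_contexts(
    tokens: List[str],
    translations: Dict[int, str],
    window: int = 2,
    tag: str = "tr:",
) -> List[Tuple[str, List[str]]]:
    """Return (target word, context) pairs with translation features added.

    `translations` maps a token position to an aligned foreign word.
    This input format is an assumption made for illustration only.
    """
    pairs = []
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        # Ordinary word-based context: neighbours within the window.
        context = [tokens[j] for j in range(lo, hi) if j != i]
        # Enrichment (assumed form): append the aligned translation, if any,
        # tagged so that it cannot collide with a source-language word.
        if i in translations:
            context.append(tag + translations[i])
        pairs.append((word, context))
    return pairs


sentence = ["she", "sat", "by", "the", "bank"]
aligned = {4: "rive"}  # French "rive" (river bank), as opposed to "banque"
pairs = enriched_contexts(sentence, aligned)
```

Under these assumptions, the context for "bank" contains both its neighbouring words and the tagged translation, which gives a multi-sense model a signal that the surrounding words alone may not provide. Any external signal could be substituted for the translation in the same slot.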
