Conference: International Joint Conference on Natural Language Processing

Bilingual Word Embeddings for Bilingual Terminology Extraction from Specialized Comparable Corpora


Abstract

Bilingual lexicon extraction from comparable corpora is constrained by the small amount of available data when dealing with specialized domains. This scarcity penalizes the performance of distributional approaches, which depends closely on the reliability of the word co-occurrence counts extracted from the comparable corpora. One way to overcome this limitation is to associate external resources with the comparable corpus. Since bilingual word embeddings have recently proven to be efficient models for learning bilingual distributed representations of words, we explore different word embedding models and show how a general-domain comparable corpus can enrich a specialized comparable corpus via neural networks.
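
To make the idea of extracting a lexicon from bilingual embeddings concrete, here is a minimal sketch. It assumes one common setup that the abstract does not confirm as the authors' method: a linear map from the source to the target embedding space is fitted on a small seed dictionary, and candidate translations are ranked by cosine similarity. The vectors are synthetic and the word labels (`src_0`, `tgt_0`, ...) and function names are hypothetical placeholders.

```python
import numpy as np


def learn_mapping(src_vecs, tgt_vecs):
    """Fit W minimizing ||src_vecs @ W - tgt_vecs|| over the seed pairs."""
    W, *_ = np.linalg.lstsq(src_vecs, tgt_vecs, rcond=None)
    return W


def rank_translations(src_vec, W, tgt_matrix, tgt_words, k=3):
    """Map one source vector into the target space and rank target words by cosine."""
    mapped = src_vec @ W
    mapped = mapped / np.linalg.norm(mapped)
    tgt_unit = tgt_matrix / np.linalg.norm(tgt_matrix, axis=1, keepdims=True)
    scores = tgt_unit @ mapped
    best = np.argsort(-scores)[:k]
    return [(tgt_words[i], float(scores[i])) for i in best]


# Synthetic data (assumption): target vectors are an exact rotation of the
# source vectors, so a linear map can recover the alignment perfectly.
rng = np.random.default_rng(0)
dim, n_words = 4, 6
src = rng.normal(size=(n_words, dim))
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
tgt = src @ rotation

src_words = [f"src_{i}" for i in range(n_words)]
tgt_words = [f"tgt_{i}" for i in range(n_words)]

# Use the first four pairs as the seed dictionary; keep the rest held out.
W = learn_mapping(src[:4], tgt[:4])

# Rank candidate translations for a held-out source word.
candidates = rank_translations(src[5], W, tgt, tgt_words, k=3)
print(src_words[5], "->", candidates)
```

Real embeddings are only approximately aligned, so the top-ranked candidate is less reliable than in this toy setting; the sketch shows the ranking mechanics only, not how a general-domain corpus would be combined with a specialized one.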
