Conference: International Conference on Intelligent Text Processing and Computational Linguistics

Instant Translation Model Adaptation by Translating Unseen Words in Continuous Vector Space


Abstract

In statistical machine translation (SMT), differences between the domains of the training and test data result in poor translations. Although there have been many studies on domain adaptation of language models and translation models, most require supervised in-domain language resources, such as parallel corpora, for training and tuning the models. This need for supervised data has made such methods difficult to apply to practical SMT systems. We therefore propose a novel method that adapts translation models without in-domain parallel corpora. Our method infers translation candidates for unseen words by nearest-neighbor search after projecting their vector-based semantic representations into the semantic space of the target language. In our experiments on out-of-domain translation from Japanese to English, our method improved the BLEU score by 0.5-1.5.
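The projection and nearest-neighbor step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the projection is obtained or which similarity measure is used, so this example assumes a linear projection matrix `W` that is already given and ranks target words by cosine similarity. All names (`project`, `cosine`, `translation_candidates`) and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def project(W, x):
    """Map a source-language vector x into the target space as y = W x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


def translation_candidates(src_vec, W, target_vectors, k=2):
    """Return the k target words closest to the projected source vector."""
    projected = project(W, src_vec)
    scored = [(cosine(projected, vec), word) for word, vec in target_vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:k]]


# Toy data (assumed for illustration only): a 2-D projection that swaps axes
# and three target-language word vectors.
W = [[0.0, 1.0],
     [1.0, 0.0]]
target_vectors = {
    "medicine": [1.0, 0.1],
    "river": [0.1, 1.0],
    "tablet": [0.9, 0.3],
}
unseen_source_vec = [0.2, 1.0]  # vector of a source word missing from the phrase table

candidates = translation_candidates(unseen_source_vec, W, target_vectors, k=2)
print(candidates)
```

In a setup like the one the abstract describes, candidates found this way would then be added to the translation model as entries for the unseen word. How they are scored and integrated is not stated in the abstract and is left out of this sketch.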
