Measuring and Increasing Context Usage in Context-Aware Machine Translation

Abstract

Recent work in neural machine translation has demonstrated both the necessity and feasibility of using inter-sentential context, that is, context from sentences other than those currently being translated. However, while many current methods present model architectures that can theoretically use this extra context, it is often unclear how much they actually utilize it at translation time. In this paper, we introduce a new metric, conditional cross-mutual information, to quantify the usage of context by these models. Using this metric, we measure how much document-level machine translation systems use particular varieties of context. We find that target context is referenced more than source context, and that conditioning on a longer context has diminishing returns. We then introduce a new, simple training method, context-aware word dropout, to increase the usage of context by context-aware models. Experiments show that our method increases context usage, and that this is reflected in translation quality according to metrics such as BLEU and COMET, as well as in performance on contrastive datasets for anaphoric pronoun resolution and lexical cohesion.
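As a rough illustration of the two ideas named in the abstract (a sketch, not the authors' implementation), conditional cross-mutual information can be estimated as the average difference in per-token log-probability between a context-aware model and a context-agnostic one scored on the same references, and context-aware word dropout amounts to randomly masking tokens of the current sentence during training so the model must lean on surrounding context. All names, the dropout rate, the mask symbol, and the log-probability values below are hypothetical.

```python
import random

def cxmi(logp_context_aware, logp_context_agnostic):
    """Estimate conditional cross-mutual information (CXMI) as the mean
    per-token difference in log-probability between a context-aware model
    and a context-agnostic one on the same reference tokens.
    A larger value suggests heavier reliance on context."""
    assert len(logp_context_aware) == len(logp_context_agnostic)
    diffs = [a - b for a, b in zip(logp_context_aware, logp_context_agnostic)]
    return sum(diffs) / len(diffs)

def context_aware_word_dropout(tokens, p=0.1, mask="<mask>"):
    """Training-time sketch: randomly replace tokens of the current source
    sentence with a mask symbol, pushing the model to use inter-sentential
    context instead. The rate `p` and mask token are illustrative."""
    return [mask if random.random() < p else tok for tok in tokens]

# Hypothetical per-token log-probs for one target sentence under each model:
lp_aware = [-1.2, -0.8, -0.5]
lp_agnostic = [-1.5, -1.0, -0.9]
print(round(cxmi(lp_aware, lp_agnostic), 3))  # positive when context helps

print(context_aware_word_dropout(["the", "cat", "sat"], p=1.0))
```

In practice the log-probabilities would come from two trained translation models evaluated on held-out documents; the point of the sketch is only the arithmetic of the estimator and the shape of the dropout operation.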