Conference: Annual Meeting of the Association for Computational Linguistics; International Joint Conference on Natural Language Processing

Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences on Neural Machine Translation

Abstract

While it has been shown that Neural Machine Translation (NMT) is highly sensitive to noisy parallel training samples, prior work treats all types of mismatches between source and target as noise. As a result, it remains unclear how samples that are mostly equivalent but contain a small number of semantically divergent tokens impact NMT training. To close this gap, we analyze the impact of different types of fine-grained semantic divergences on Transformer models. We show that models trained on synthetic divergences output degenerated text more frequently and are less confident in their predictions. Based on these findings, we introduce a divergent-aware NMT framework that uses factors to help NMT recover from the degradation caused by naturally occurring divergences, improving both translation quality and model calibration on EN←→FR tasks.
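The "factors" mentioned in the abstract follow the factored-NMT idea of attaching an extra label embedding to each token. A minimal sketch of that mechanism, assuming binary token-level divergence tags (0 = equivalent, 1 = divergent); all names, dimensions, and tables here are illustrative toys, not the paper's implementation:

```python
# Toy sketch of factored input for divergence-aware NMT.
# Each token carries a binary divergence factor; the factor embedding is
# concatenated to the word embedding before being fed to the encoder.
import random

EMB_DIM = 4      # word-embedding size (toy)
FACTOR_DIM = 2   # factor-embedding size (toy)

random.seed(0)

def make_table(vocab_size, dim):
    """Toy embedding table: one random vector per id."""
    return [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]

word_table = make_table(10, EMB_DIM)
factor_table = make_table(2, FACTOR_DIM)   # two factors: EQ (0) / DIV (1)

def embed(token_ids, divergence_tags):
    """Concatenate word and factor embeddings token by token."""
    assert len(token_ids) == len(divergence_tags)
    return [word_table[t] + factor_table[f]   # list concatenation = vector concat
            for t, f in zip(token_ids, divergence_tags)]

# Example: a 3-token sentence whose last token is semantically divergent.
vectors = embed([3, 7, 1], [0, 0, 1])
print(len(vectors), len(vectors[0]))  # 3 tokens, each EMB_DIM + FACTOR_DIM wide
```

The point of the concatenation is that the model can learn to discount tokens tagged as divergent during training instead of treating the whole sentence pair as noise.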
