Conference: Annual Meeting of the Association for Computational Linguistics; International Joint Conference on Natural Language Processing

Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences on Neural Machine Translation

Abstract

While it has been shown that Neural Machine Translation (NMT) is highly sensitive to noisy parallel training samples, prior work treats all types of mismatches between source and target as noise. As a result, it remains unclear how samples that are mostly equivalent but contain a small number of semantically divergent tokens impact NMT training. To close this gap, we analyze the impact of different types of fine-grained semantic divergences on Transformer models. We show that models trained on synthetic divergences output degenerated text more frequently and are less confident in their predictions. Based on these findings, we introduce a divergent-aware NMT framework that uses factors to help NMT recover from the degradation caused by naturally occurring divergences, improving both translation quality and model calibration on EN←→FR tasks.
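The "factors" mentioned in the abstract follow the factored-NMT idea of attaching an extra label embedding to each token. A minimal sketch of that mechanism, assuming binary token-level divergence tags (0 = equivalent, 1 = divergent); all names, dimensions, and tables here are illustrative toys, not the paper's implementation:

```python
# Toy sketch of factored input for divergence-aware NMT.
# Each token carries a binary divergence factor; the factor embedding is
# concatenated to the word embedding before being fed to the encoder.
import random

EMB_DIM = 4      # word-embedding size (toy)
FACTOR_DIM = 2   # factor-embedding size (toy)

random.seed(0)

def make_table(vocab_size, dim):
    """Toy embedding table: one random vector per id."""
    return [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]

word_table = make_table(10, EMB_DIM)
factor_table = make_table(2, FACTOR_DIM)   # two factors: EQ (0) / DIV (1)

def embed(token_ids, divergence_tags):
    """Concatenate word and factor embeddings token by token."""
    assert len(token_ids) == len(divergence_tags)
    return [word_table[t] + factor_table[f]   # list concatenation = vector concat
            for t, f in zip(token_ids, divergence_tags)]

# Example: a 3-token sentence whose last token is semantically divergent.
vectors = embed([3, 7, 1], [0, 0, 1])
print(len(vectors), len(vectors[0]))  # 3 tokens, each EMB_DIM + FACTOR_DIM wide
```

The point of the concatenation is that the model can learn to discount tokens tagged as divergent during training instead of treating the whole sentence pair as noise.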
