Annual Meeting of the Association for Computational Linguistics

Improving Neural Machine Translation Models with Monolingual Data



Abstract

Neural Machine Translation (NMT) has obtained state-of-the-art performance for several language pairs, while only using parallel data for training. Target-side monolingual data plays an important role in boosting fluency for phrase-based statistical machine translation, and we investigate the use of monolingual data for NMT. In contrast to previous work, which combines NMT models with separately trained language models, we note that encoder-decoder NMT architectures already have the capacity to learn the same information as a language model, and we explore strategies to train with monolingual data without changing the neural network architecture. By pairing monolingual training data with an automatic back-translation, we can treat it as additional parallel training data, and we obtain substantial improvements on the WMT 15 task English↔German (+2.8-3.7 BLEU) and for the low-resource IWSLT 14 task Turkish↔English (+2.1-3.4 BLEU), obtaining new state-of-the-art results. We also show that fine-tuning on in-domain monolingual and parallel data gives substantial improvements for the IWSLT 15 task English→German.
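The back-translation scheme the abstract describes can be sketched as a small data pipeline: target-side monolingual sentences are translated back into the source language by a reverse (target→source) model, and each (synthetic source, real target) pair is then mixed into the parallel training data. The sketch below is illustrative only; `backtranslate` is a hypothetical stand-in for a trained reverse NMT system, not part of the paper's code.

```python
def backtranslate(target_sentence):
    # Hypothetical reverse model: in a real system this would be a full
    # target->source NMT decoder producing a synthetic source sentence.
    return "[synthetic-src] " + target_sentence

def build_synthetic_parallel(monolingual_target):
    """Pair each monolingual target sentence with its automatic
    back-translation, yielding (source, target) training pairs."""
    return [(backtranslate(t), t) for t in monolingual_target]

def mix_training_data(real_pairs, synthetic_pairs):
    # Train on the union of real and synthetic pairs; the network
    # architecture itself is left unchanged.
    return real_pairs + synthetic_pairs

mono = ["Das ist ein Test.", "Guten Morgen."]
real = [("This is a sentence.", "Das ist ein Satz.")]
data = mix_training_data(real, build_synthetic_parallel(mono))
```

The key design point is that the synthetic pairs have clean, human-written target sides, so translation noise on the source side still lets the decoder learn fluent target-language output.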


