Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation

Abstract

We propose an efficient inference procedure for non-autoregressive machine translation that iteratively refines translation purely in the continuous space. Given a continuous latent variable model for machine translation (Shu et al., 2020), we train an inference network to approximate the gradient of the marginal log probability of the target sentence, using only the latent variable as input. This allows us to use gradient-based optimization to find the target sentence at inference time that approximately maximizes its marginal probability. As each refinement step only involves computation in the latent space of low dimensionality (we use 8 in our experiments), we avoid computational overhead incurred by existing non-autoregressive inference procedures that often refine in token space. We compare our approach to a recently proposed EM-like inference procedure (Shu et al., 2020) that optimizes in a hybrid space, consisting of both discrete and continuous variables. We evaluate our approach on WMT'14 En→De, WMT'16 Ro→En and IWSLT'16 De→En, and observe two advantages over the EM-like inference: (1) it is computationally efficient, i.e., each refinement step is twice as fast, and (2) it is more effective, resulting in higher marginal probabilities and BLEU scores with the same number of refinement steps. On WMT'14 En→De, for instance, our approach is able to decode 6.2 times faster than the autoregressive model with minimal degradation to translation quality (0.9 BLEU).
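The refinement loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `approx_gradient` is a hypothetical stand-in for the trained inference network (here it returns the exact gradient of a unit-variance Gaussian log density centred at an assumed optimum `mode`), and the step size and step count are arbitrary illustrative choices. Only the latent dimensionality of 8 comes from the abstract.

```python
import math
import random

LATENT_DIM = 8  # latent dimensionality reported in the abstract


def approx_gradient(z, mode):
    """Hypothetical stand-in for the trained inference network.

    Returns the gradient of a unit-variance Gaussian log density centred
    at `mode`. A real network would be learned and would also condition
    on the source sentence; this toy version does neither.
    """
    return [[m - v for v, m in zip(z_row, m_row)] for z_row, m_row in zip(z, mode)]


def refine(z0, mode, num_steps=4, step_size=0.5):
    """Gradient ascent on the latent variable, entirely in continuous space."""
    z = [row[:] for row in z0]  # copy so the initial latent is left untouched
    for _ in range(num_steps):
        grad = approx_gradient(z, mode)
        z = [[v + step_size * g for v, g in zip(z_row, g_row)]
             for z_row, g_row in zip(z, grad)]
    return z


def distance(a, b):
    """Euclidean distance between two latent matrices of equal shape."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


rng = random.Random(0)
target_len = 5  # assumed target length for the toy example
mode = [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)] for _ in range(target_len)]
z0 = [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)] for _ in range(target_len)]
z_final = refine(z0, mode)

# Each step halves the gap to the toy optimum, so four steps shrink it 16-fold.
print(distance(z0, mode), distance(z_final, mode))
```

Each update touches only a `target_len` by 8 matrix of real numbers, which is the property the abstract credits for the lower per-step cost compared with refinement in token space.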

Bibliographic details

Source: Conference on Empirical Methods in Natural Language Processing, 2020, pp. 1006-1015 (10 pages)
Authors: Jason Lee; Raphael Shu; Kyunghyun Cho
Original format: PDF

Similar literature

1. Iterative Training of Unsupervised Neural and Statistical Machine Translation Systems [J]. Marie Benjamin, Fujita Atsushi. ACM Transactions on Asian Language Information Processing. 2020, No. 5
2. Converting Continuous-Space Language Models into N-gram Language Models with Efficient Bilingual Pruning for Statistical Machine Translation [J]. Rui Wang, Masao Utiyama, Isao Goto. ACM Transactions on Asian Language Information Processing. 2016, No. 3
3. Patent Issued for Machine Translation in Continuous Space [J]. Robotics and Machine Learning. 2012, No. 32
4. Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement [C]. Jason Lee, Elman Mansimov, Kyunghyun Cho. Conference on Empirical Methods in Natural Language Processing. 2018
5. Neural Structured Prediction Using Iterative Refinement with Applications to Text and Molecule Generation [D]. Mansimov, Elman. 2021
6. Improved crystallographic models through iterated local density-guided model deformation and reciprocal-space refinement [O]. Thomas C. Terwilliger, Randy J. Read, Paul D. Adams
7. Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation [O]. Chenze Shao, Jinchao Zhang, Yang Feng. 2020
