Variable-Length Word Encodings for Neural Translation Models

Abstract

Recent work in neural machine translation has shown promising performance, but the most effective architectures do not scale naturally to large vocabulary sizes. We propose and compare three variable-length encoding schemes that represent a large vocabulary corpus using a much smaller vocabulary with no loss in information. Common words are unaffected by our encoding, but rare words are encoded using a sequence of two pseudo-words. Our method is simple and effective: it requires no complete dictionaries, learning procedures, increased training time, changes to the model, or new parameters. Compared to a baseline that replaces all rare words with an unknown word symbol, our best variable-length encoding strategy improves WMT English-French translation performance by up to 1.7 BLEU.
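
The core idea in the abstract, leaving common words intact and spelling each rare word as two pseudo-words, can be illustrated with a minimal Python sketch. It is not necessarily any of the three schemes the paper compares. The function names (`build_codebook`, `encode`, `decode`), the `<sN>` pseudo-word naming, and the assignment of pairs by frequency rank are all assumptions made for illustration, and the sketch assumes the corpus contains no real tokens of the form `<sN>`.

```python
import math
from collections import Counter


def build_codebook(corpus, num_common):
    """Build a lossless two-symbol code for the rare words of a tokenized corpus.

    Hypothetical helper: the `num_common` most frequent words are kept as-is;
    every other word is assigned a unique pair of pseudo-words.
    """
    counts = Counter(tok for sent in corpus for tok in sent)
    # Rank by descending frequency; break ties alphabetically for determinism.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    common = set(ranked[:num_common])
    rare = ranked[num_common:]
    # Smallest pseudo-word inventory S such that S * S >= number of rare words.
    size = math.isqrt(len(rare) - 1) + 1 if rare else 0
    encode_map = {
        word: (f"<s{i // size}>", f"<s{i % size}>") for i, word in enumerate(rare)
    }
    decode_map = {pair: word for word, pair in encode_map.items()}
    return common, encode_map, decode_map


def encode(sentence, encode_map):
    """Replace each rare word with its two pseudo-words; keep other tokens."""
    out = []
    for tok in sentence:
        out.extend(encode_map.get(tok, (tok,)))
    return out


def decode(tokens, decode_map):
    """Invert `encode`. Assumes well-formed input (pseudo-words come in valid pairs)."""
    pseudo = {sym for pair in decode_map for sym in pair}
    out, i = [], 0
    while i < len(tokens):
        if tokens[i] in pseudo:
            out.append(decode_map[(tokens[i], tokens[i + 1])])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "aardvark", "ran"],
]
common, encode_map, decode_map = build_codebook(corpus, num_common=2)
encoded = [encode(sent, encode_map) for sent in corpus]
# The encoding is lossless: decoding recovers the original sentences.
assert [decode(sent, decode_map) for sent in encoded] == corpus
```

In this toy corpus the six original word types shrink to four symbols (two common words plus two pseudo-words), at the cost of longer sequences wherever a rare word appears. A model trained on the encoded text would emit pseudo-word pairs that are decoded afterwards; handling malformed pairs in model output is left out of the sketch.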

Bibliographic details

Source: Conference on Empirical Methods in Natural Language Processing, 2015, pp. 2088-2093 (6 pages)
Authors: Rohan Chitnis; John DeNero
Original format: PDF

Similar documents

1. Explicitly Modeling Word Translations in Neural Machine Translation [J]. Han Dong, Li Junhui, Li Yachao. ACM Transactions on Asian Language Information Processing. 2020, Issue 1
2. The representational geometry of word meanings acquired by neural machine translation models [J]. Hill Felix, Cho Kyunghyun, Jean Sébastien. Machine Translation. 2017, Issue 1a2
3. Towards Integrated Classification Lexicon for Handling Unknown Words in Chinese-Vietnamese Neural Machine Translation [J]. Wanjin Che, Zhengtao Yu, Zhiqiang Yu. ACM Transactions on Asian and Low-Resource Language Information Processing. 2020, Issue 3
4. Variable-Length Word Encodings for Neural Translation Models [C]. Rohan Chitnis, John DeNero. Conference on Empirical Methods in Natural Language Processing. 2015
5. Animals of North America, Selected Translations, and the Critical Afterword, "Extending Relational Field Theories: A Correlationist Model of Poetics & Translation" [D]. Rock, Martin. 2018
6. Neural correlates of the episodic encoding of pictures and words [O]. Cheryl L. Grady, Anthony R. McIntosh, M. Natasha Rajah. 1998
7. Variable-Length Word Encodings for Neural Translation Models [O]. Rohan Chitnis, John DeNero. 2015
