Conference: Annual Meeting of the Association for Computational Linguistics

Multilingual Unsupervised NMT using Shared Encoder and Language-Specific Decoders

Abstract

In this paper, we propose a multilingual unsupervised NMT scheme which jointly trains multiple languages with a shared encoder and multiple decoders. Our approach is based on denoising autoencoding of each language and back-translating between English and multiple non-English languages. This results in a universal encoder, which can encode any language participating in training into an interlingual representation, and language-specific decoders. Our experiments using only monolingual corpora show that the multilingual unsupervised model performs better than separately trained bilingual models, achieving improvements of up to 1.48 BLEU points on WMT test sets. We also observe that even if we do not train the network for all possible translation directions, the network is still able to translate in a many-to-many fashion, leveraging the encoder's ability to generate an interlingual representation.
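The abstract describes the recipe concretely enough to sketch: one encoder shared across all languages, one decoder per language, trained jointly with denoising autoencoding on monolingual data plus back-translation through English. Below is a minimal PyTorch sketch of that setup; the module names, toy noise function, one-shot greedy back-translation, and all hyperparameters are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of a shared encoder with language-specific decoders,
# trained by denoising autoencoding + back-translation (assumptions noted).
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """One encoder shared by every language; maps token ids to an
    interlingual sequence representation."""
    def __init__(self, vocab_size, d_model=256, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, src):
        return self.encoder(self.embed(src))

class LanguageDecoder(nn.Module):
    """One decoder per target language, attending over the shared encoding."""
    def __init__(self, vocab_size, d_model=256, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tgt, memory):
        return self.out(self.decoder(self.embed(tgt), memory))

VOCAB = 1000  # toy shared vocabulary size (assumption)
encoder = SharedEncoder(VOCAB)
decoders = nn.ModuleDict({lang: LanguageDecoder(VOCAB)
                          for lang in ["en", "fr", "de"]})

def noise(src):
    # Toy denoising corruption: random token dropout (an assumption; this
    # line of work typically also uses local word shuffling).
    keep = torch.rand_like(src, dtype=torch.float) > 0.1
    return src * keep.long()

def denoising_step(lang, batch):
    """Reconstruct a monolingual batch from its noised version."""
    logits = decoders[lang](batch, encoder(noise(batch)))
    return nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), batch.reshape(-1))

@torch.no_grad()
def back_translate(tgt_lang, batch):
    # Greedy pseudo-translation used to build synthetic parallel data; a real
    # system decodes autoregressively, this one-shot argmax is a toy stand-in.
    return decoders[tgt_lang](batch, encoder(batch)).argmax(-1)

def back_translation_step(src_lang, tgt_lang, batch):
    """Train to reconstruct the original batch from its pseudo-translation."""
    pseudo = back_translate(tgt_lang, batch)
    logits = decoders[src_lang](batch, encoder(pseudo))
    return nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), batch.reshape(-1))

# One joint update over monolingual batches from each language, with
# back-translation routed through English only, as in the abstract.
params = list(encoder.parameters()) + list(decoders.parameters())
opt = torch.optim.Adam(params, lr=1e-4)
batches = {lang: torch.randint(1, VOCAB, (8, 16)) for lang in decoders}
loss = sum(denoising_step(lang, b) for lang, b in batches.items())
loss = loss + back_translation_step("en", "fr", batches["en"])
loss = loss + back_translation_step("fr", "en", batches["fr"])
opt.zero_grad(); loss.backward(); opt.step()
print(f"joint loss: {loss.item():.3f}")
```

Note how a direction that was never trained needs no extra machinery in this setup: because the encoder output is meant to be interlingual, `decoders["de"]` can attend over `encoder(fr_batch)` directly, which is the many-to-many behavior the abstract reports.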
