Conference on Empirical Methods in Natural Language Processing (EMNLP)

Towards Reasonably-Sized Character-Level Transformer NMT by Finetuning Subword Systems

Abstract

Applying the Transformer architecture on the character level usually requires very deep architectures that are difficult and slow to train. These problems can be partially overcome by incorporating a segmentation into tokens in the model. We show that by initially training a subword model and then finetuning it on characters, we can obtain a neural machine translation model that works at the character level without requiring token segmentation. We use only the vanilla 6-layer Transformer Base architecture. Our character-level models better capture morphological phenomena and show more robustness to noise at the expense of somewhat worse overall translation quality. Our study is a significant step towards high-performance and easy to train character-based models that are not extremely large.
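The abstract outlines a two-stage recipe: first train an ordinary subword-level Transformer Base model, then continue training the same network on character-segmented data. Below is a minimal PyTorch sketch of that idea, not the authors' implementation: the vocabulary sizes, the class and variable names, and the choice to freshly re-initialize the character-level embedding and output layers are illustrative assumptions, since the abstract does not specify those details.

```python
import torch
import torch.nn as nn

# Placeholder vocabulary sizes (illustrative, not from the paper).
SUBWORD_VOCAB_SIZE = 32_000   # e.g. a BPE vocabulary
CHAR_VOCAB_SIZE = 300         # characters plus special symbols
D_MODEL = 512                 # Transformer Base dimensionality


class Seq2SeqTransformer(nn.Module):
    """Vanilla 6-layer encoder/decoder Transformer.

    Positional encodings and attention masks are omitted for brevity;
    only the vocabulary-dependent layers matter for this sketch.
    """

    def __init__(self, vocab_size: int):
        super().__init__()
        self.src_embed = nn.Embedding(vocab_size, D_MODEL)
        self.tgt_embed = nn.Embedding(vocab_size, D_MODEL)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=8,
            num_encoder_layers=6, num_decoder_layers=6,
            dim_feedforward=2048, batch_first=True,
        )
        self.out_proj = nn.Linear(D_MODEL, vocab_size)

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor) -> torch.Tensor:
        h = self.transformer(self.src_embed(src_ids), self.tgt_embed(tgt_ids))
        return self.out_proj(h)  # logits over the current vocabulary


# Stage 1: train on subword-segmented parallel data (training loop omitted).
model = Seq2SeqTransformer(SUBWORD_VOCAB_SIZE)
# ... standard NMT training with subword token ids ...

# Stage 2: keep the trained Transformer body, swap the vocabulary-dependent
# layers for character-sized ones, and finetune on character-segmented data.
# Re-initializing these layers is an assumption of this sketch.
model.src_embed = nn.Embedding(CHAR_VOCAB_SIZE, D_MODEL)
model.tgt_embed = nn.Embedding(CHAR_VOCAB_SIZE, D_MODEL)
model.out_proj = nn.Linear(D_MODEL, CHAR_VOCAB_SIZE)
# ... continue training with character token ids ...

# Quick shape check with dummy character-id batches.
dummy_src = torch.randint(0, CHAR_VOCAB_SIZE, (2, 20))
dummy_tgt = torch.randint(0, CHAR_VOCAB_SIZE, (2, 25))
logits = model(dummy_src, dummy_tgt)  # (2, 25, CHAR_VOCAB_SIZE)
```

The point of the second stage is that the expensive part of the model, the 6-layer encoder and decoder, is reused as-is, so only the embedding and output projection need to adapt to the character vocabulary during finetuning.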