
Transfer Learning for Sequence Generation: from Single-source to Multi-source



Abstract

Multi-source sequence generation (MSG) is an important class of sequence generation tasks that take multiple sources as input, including automatic post-editing, multi-source translation, and multi-document summarization. Since MSG tasks suffer from data scarcity and recent pretrained models have proven effective for low-resource downstream tasks, transferring pretrained sequence-to-sequence models to MSG tasks is essential. Although concatenating multiple sources into a single long sequence and directly finetuning a pretrained model on it is a simple way to transfer pretrained models to MSG tasks, we conjecture that direct finetuning leads to catastrophic forgetting, and that relying solely on pretrained self-attention layers to capture cross-source information is insufficient. We therefore propose a two-stage finetuning method to alleviate the pretrain-finetune discrepancy and introduce a novel MSG model with a fine encoder to learn better representations for MSG tasks. Experiments show that our approach achieves new state-of-the-art results on the WMT17 APE task and on a multi-source translation task using the WMT14 test set. When adapted to document-level translation, our framework significantly outperforms strong baselines.
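To make the "concatenate multiple sources into a single long sequence" baseline concrete, the following is a minimal sketch in Python using a Hugging Face Transformers pretrained sequence-to-sequence checkpoint. The model name, separator string, and the automatic post-editing example inputs are illustrative assumptions, not the authors' actual setup.

    # Minimal sketch (not the authors' code) of the single-sequence
    # concatenation baseline for MSG: multiple sources are joined with a
    # separator and fed to a pretrained seq2seq model as one long input.
    # Model name and separator token are assumptions for illustration.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    model_name = "facebook/mbart-large-cc25"  # any pretrained seq2seq checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    def concat_sources(sources, sep=" </s> "):
        # For automatic post-editing, the sources would be, e.g.,
        # (machine translation output, original source sentence).
        return sep.join(sources)

    inputs = tokenizer(
        concat_sources(["Das ist ein Test .", "This is a test ."]),
        return_tensors="pt",
    )
    output_ids = model.generate(**inputs, max_length=64)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Under this baseline, only the pretrained self-attention layers can relate tokens across the two sources, which is the limitation the abstract's conjecture points at.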
