Conference of the European Chapter of the Association for Computational Linguistics

Quantifying Appropriateness of Summarization Data for Curriculum Learning



Abstract

Much research has reported that the training data of summarization models is noisy: summaries often do not reflect what is written in the source texts. We propose an effective curriculum learning method for training summarization models on such noisy data. Curriculum learning has been used to train sequence-to-sequence models on noisy data. In translation tasks, previous research quantified the noise in training data using two models, one trained on a noisy corpus and one on a clean corpus. Because such paired corpora do not exist for summarization, we propose a model that quantifies noise from a single noisy corpus. We conduct experiments on three summarization models (one pretrained and two non-pretrained) and verify that our method improves performance. Furthermore, we analyze how different curricula affect the performance of pretrained and non-pretrained summarization models. Our human evaluation results also show that our method improves the performance of summarization models.
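The core idea above, scoring how well each summary is grounded in its source and training on the cleanest pairs first, can be sketched as follows. This is a minimal illustration, not the paper's method: the `overlap_score` proxy (fraction of summary tokens appearing in the source) is a hypothetical stand-in for the paper's learned noise-quantification model, which is trained from a single noisy corpus.

```python
def overlap_score(source: str, summary: str) -> float:
    """Toy appropriateness proxy: fraction of summary tokens found in the source.
    A low score suggests a noisy pair (the summary is not grounded in the source).
    The paper instead learns this score from a single noisy corpus."""
    src_tokens = set(source.lower().split())
    toks = summary.lower().split()
    if not toks:
        return 0.0
    return sum(t in src_tokens for t in toks) / len(toks)

def curriculum_order(pairs):
    """Order (source, summary) pairs from cleanest to noisiest, so that
    training starts on well-grounded examples and ends on noisy ones."""
    return sorted(pairs, key=lambda p: overlap_score(*p), reverse=True)

# Hypothetical mini-corpus: one grounded pair, one noisy pair.
pairs = [
    ("stocks fell sharply in early trading today", "aliens land on mars"),
    ("the cat sat on the mat all afternoon", "cat sat on mat"),
]
ordered = curriculum_order(pairs)
# The grounded pair comes first in the curriculum.
```

In practice the curriculum is applied by feeding batches to the summarization model in this order (or by gradually admitting noisier examples over epochs), rather than by discarding low-scoring pairs outright.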
