Workshop on Domain Adaptation for NLP

Few-Shot Learning of an Interleaved Text Summarization Model by Pretraining with Synthetic Data

Abstract

Interleaved texts, where posts belonging to different threads occur in a single sequence, are common in online chat, making it time-consuming to quickly obtain an overview of the discussions. Existing systems first disentangle the posts by thread and then extract summaries from those threads. A major issue with such systems is error propagation from the disentanglement component. While an end-to-end trainable summarization system could obviate explicit disentanglement, such systems require a large amount of labeled data. To address this, we propose to pretrain an end-to-end trainable hierarchical encoder-decoder system using synthetic interleaved texts. We show that, after fine-tuning on a real-world meeting dataset (AMI), such a system outperforms a traditional two-step system by 22%. We also compare against transformer models and observe that pretraining both the encoder and the decoder with synthetic data outperforms the BertSumExtAbs transformer model, which pretrains only the encoder on a large dataset.
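As an illustration of what such synthetic pretraining data might look like, the following Python snippet is a minimal sketch, not the authors' exact procedure: the helper names `interleave_threads` and `make_example`, and the choice of each thread's lead sentence as a pseudo-summary target, are hypothetical. It merges posts from several single-thread documents into one interleaved sequence while preserving each thread's internal order, and pairs it with a per-thread pseudo-summary target.

```python
# Sketch of synthetic interleaved-text construction for pretraining
# (illustrative assumptions only; not the paper's exact data pipeline).
import random

def interleave_threads(threads, seed=0):
    """Randomly merge threads into one post sequence, keeping intra-thread order."""
    rng = random.Random(seed)
    queues = [list(posts) for posts in threads]
    interleaved = []
    while any(queues):
        idx = rng.choice([i for i, q in enumerate(queues) if q])
        interleaved.append((idx, queues[idx].pop(0)))
    return interleaved

def make_example(threads):
    """Build one (source, target) pair for encoder-decoder pretraining."""
    source = [post for _, post in interleave_threads(threads)]
    # Hypothetical target: the lead sentence of each thread as its pseudo-summary.
    target = [posts[0] for posts in threads]
    return source, target

if __name__ == "__main__":
    thread_a = ["Can we move the meeting?", "Tuesday works for me.", "Tuesday it is."]
    thread_b = ["The build is failing on CI.", "Pinning the compiler version fixed it."]
    src, tgt = make_example([thread_a, thread_b])
    print(src)
    print(tgt)
```

In the paper's setup the real fine-tuning data come from AMI meetings; the thread contents and lead-sentence targets above are purely illustrative.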
