Long-Span Summarization via Local Attention and Content Selection

Abstract

Transformer-based models have achieved state-of-the-art results in a wide range of natural language processing (NLP) tasks, including document summarization. Typically these systems are trained by fine-tuning a large pre-trained model to the target task. One issue with these transformer-based models is that they do not scale well in terms of memory and compute requirements as the input length grows. Thus, for long document summarization, it can be challenging to train or fine-tune these models. In this work, we exploit large pre-trained transformer-based models and address long-span dependencies in abstractive summarization using two methods: local self-attention and explicit content selection. These approaches are compared on a range of network configurations. Experiments are carried out on standard long-span summarization tasks, including the Spotify Podcast, arXiv, and PubMed datasets. We demonstrate that by combining these methods, we can achieve state-of-the-art ROUGE results on all three tasks. Moreover, without a large-scale GPU card, our approach can achieve comparable or better results than existing approaches.
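To make the local self-attention idea mentioned in the abstract concrete, here is a minimal NumPy sketch (not the authors' implementation) in which each query position attends only to keys within a fixed window around it, so the number of attended positions per token stays constant as the input grows. The function name local_self_attention and the window parameter are illustrative assumptions; for clarity this sketch still materializes the full score matrix, whereas an efficient implementation would compute only the banded scores.

import numpy as np

def local_self_attention(Q, K, V, window):
    """Single-head scaled dot-product attention restricted to a local window.

    Each query position i attends only to key positions j with |i - j| <= window.
    Q, K, V: arrays of shape (seq_len, d_model).
    Illustrative sketch: builds the full (seq_len, seq_len) score matrix and masks it,
    rather than computing only the band as an efficient implementation would.
    """
    seq_len, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                     # (seq_len, seq_len) raw scores

    # Band mask: block positions outside the local window before the softmax.
    idx = np.arange(seq_len)
    outside = np.abs(idx[:, None] - idx[None, :]) > window
    scores = np.where(outside, -1e9, scores)

    # Numerically stable softmax over the allowed positions only.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Example: 8 tokens, 4-dim embeddings, window of 2 tokens on each side.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
out = local_self_attention(x, x, x, window=2)
print(out.shape)  # (8, 4)

The explicit content-selection step described in the paper would sit in front of such a model, choosing the most salient input segments so that the sequence passed to the summarizer fits within the attention span; that selection logic is not shown here.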
