Conference proceedings: New Frontiers in Artificial Intelligence

A Hierarchical Neural Extractive Summarizer for Academic Papers


Abstract

Neural network-based models have recently proven successful in summarization tasks. However, previous studies focus mostly on comparatively short texts, and it remains challenging for neural models to summarize long documents such as academic papers. The length of academic papers poses two obstacles: it is hard for a recurrent neural network (RNN) to compress all the information in the source document into a single latent vector, and it is difficult to pinpoint a few correct sentences among a large number of candidates. In this paper, we present an extractive summarizer for academic papers. The idea is to convert a paper into a tree structure whose nodes correspond to sections, paragraphs, and sentences. First, we build a hierarchical encoder-decoder model based on the tree. This design eases the load on the RNNs and lets us effectively obtain vectors that represent paragraphs and sections. Second, we propose a tree structure-based scoring method that steers the model toward correct sentences and helps it avoid selecting irrelevant ones. We collect academic papers available from PubMed Central and build training data suited to supervised extractive summarization. Our experimental results show that the proposed model outperforms several baselines and reduces high-impact errors.
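
The abstract does not spell out the scoring formula, so the following is only an illustrative sketch of the general idea: a paper is represented as a tree of sections, paragraphs, and sentences, and a sentence is ranked by combining its own score with the scores of its ancestors. The class names, the `rank_sentences` helper, the multiplicative combination, and all score values are assumptions made for this example; in the paper the scores would come from the hierarchical encoder-decoder rather than being hard-coded.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sentence:
    text: str
    score: float  # hypothetical relevance score in [0, 1]


@dataclass
class Paragraph:
    sentences: List[Sentence]
    score: float  # hypothetical paragraph-level score


@dataclass
class Section:
    title: str
    paragraphs: List[Paragraph]
    score: float  # hypothetical section-level score


def rank_sentences(sections: List[Section], k: int) -> List[str]:
    """Return the k sentences with the highest tree-based score.

    Assumption: the tree-based score is the product of the section,
    paragraph, and sentence scores on the path to the sentence.
    """
    scored = []
    for sec in sections:
        for par in sec.paragraphs:
            for sent in par.sentences:
                scored.append((sec.score * par.score * sent.score, sent.text))
    # Highest combined score first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy paper with made-up scores.
paper = [
    Section(
        title="Results",
        score=0.9,
        paragraphs=[
            Paragraph(
                score=0.8,
                sentences=[
                    Sentence("Our model outperforms baselines.", 0.7),
                    Sentence("Table 2 lists the scores.", 0.3),
                ],
            )
        ],
    ),
    Section(
        title="Acknowledgements",
        score=0.1,
        paragraphs=[
            Paragraph(
                score=0.9,
                sentences=[Sentence("We thank the reviewers.", 0.95)],
            )
        ],
    ),
]

summary = rank_sentences(paper, k=2)
print(summary)
```

In this toy input the acknowledgement sentence has the highest sentence-level score, yet it is not selected because its section score is low. This is one way a tree-aware score could help a model avoid irrelevant sentences, as the abstract describes.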
