Conference: JSAI International Symposium on Artificial Intelligence

A Hierarchical Neural Extractive Summarizer for Academic Papers



Abstract

Recent neural network-based models have proven successful in summarization tasks. However, previous studies mostly focus on comparatively short texts, and it remains challenging for neural models to summarize long documents such as academic papers. Because of their length, academic papers pose two obstacles for summarization: it is hard for a recurrent neural network (RNN) to squash all the information in the source document into a latent vector, and it is difficult to pinpoint a few correct sentences among a large number of sentences. In this paper, we present an extractive summarizer for academic papers. The idea is to convert a paper into a tree structure composed of nodes corresponding to sections, paragraphs, and sentences. First, we build a hierarchical encoder-decoder model based on the tree. This design eases the load on the RNNs and enables us to effectively obtain vectors that represent paragraphs and sections. Second, we propose a tree structure-based scoring method to steer our model toward correct sentences, which also helps the model avoid selecting irrelevant sentences. We collect academic papers available from PubMed Central and build training data suited to supervised extractive summarization. Our experimental results show that the proposed model outperforms several baselines and reduces high-impact errors.
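
The tree structure described in the abstract (section, paragraph, and sentence nodes) can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, `build_tree`, `iter_sentences`, and the nested-list input format are all hypothetical choices made for illustration, and the sketch covers only the document-to-tree conversion, not the encoder-decoder or the scoring method.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Sentence:
    text: str


@dataclass
class Paragraph:
    sentences: List[Sentence] = field(default_factory=list)


@dataclass
class Section:
    title: str
    paragraphs: List[Paragraph] = field(default_factory=list)


@dataclass
class Paper:
    sections: List[Section] = field(default_factory=list)


def build_tree(raw: List[Tuple[str, List[List[str]]]]) -> Paper:
    """Build a Paper tree from (section_title, paragraphs) pairs.

    Assumed input format (hypothetical): each paragraph is a list of
    sentence strings that have already been split upstream.
    """
    return Paper([
        Section(title, [Paragraph([Sentence(s) for s in para]) for para in paras])
        for title, paras in raw
    ])


def iter_sentences(paper: Paper) -> Iterator[Tuple[int, int, int, str]]:
    """Yield (section_idx, paragraph_idx, sentence_idx, text) for every leaf."""
    for si, sec in enumerate(paper.sections):
        for pi, para in enumerate(sec.paragraphs):
            for ti, sent in enumerate(para.sentences):
                yield si, pi, ti, sent.text


# Toy document: two sections, three paragraphs, four sentences.
example = build_tree([
    ("Introduction", [["Sentence a.", "Sentence b."], ["Sentence c."]]),
    ("Methods", [["Sentence d."]]),
])
leaves = list(iter_sentences(example))
```

Each leaf carries its path through the tree, so a sentence-level score could in principle be combined with information from its enclosing paragraph and section; how the paper actually does this is not specified in the abstract.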
