Journal: International Journal of Parallel Programming

Abstractive Text Summarization based on Improved Semantic Graph Approach



Abstract

The goal of abstractive multi-document summarization is to automatically produce a condensed version of the source text while preserving its significant information. Most graph-based extractive methods represent each sentence as a bag of words and rely on a content similarity measure, which may fail to detect semantically equivalent redundant sentences. On the other hand, graph-based abstractive methods depend on a domain expert to build a semantic graph from a manually created ontology, which requires time and effort. This work presents a semantic graph approach with an improved ranking algorithm for abstractive multi-document summarization. The semantic graph is built from the source documents such that the graph nodes denote predicate argument structures (PASs), the semantic structures of sentences, which are identified automatically using semantic role labeling, while the graph edges carry similarity weights computed from the semantic similarity of the PASs. To reflect the impact of both the document and the document set on the PASs, the edges of the semantic graph are further augmented with PAS-to-document and PAS-to-document-set relationships. The important graph nodes (PASs) are ranked using the improved graph ranking algorithm. Redundant PASs are reduced by re-ranking the PASs with maximal marginal relevance, and summary sentences are finally generated from the top-ranked PASs using language generation. The experiments use DUC-2002, a standard dataset for document summarization. The experimental findings indicate that the proposed approach outperforms other summarization approaches.
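
The rank-then-re-rank pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the improved ranking algorithm, the PAS semantic similarity measure, or the PAS-to-document augmentation, so the sketch makes the following assumptions: a plain weighted PageRank-style iteration stands in for the improved ranking algorithm, Jaccard overlap of token sets stands in for PAS semantic similarity, and each PAS is reduced to a set of tokens. All function names and parameter values (`damping`, `lam`) are hypothetical.

```python
from typing import List, Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token overlap; an assumed stand-in for PAS semantic similarity."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_matrix(items: Sequence[Set[str]]) -> List[List[float]]:
    """Symmetric edge weights between nodes, with no self-loops."""
    n = len(items)
    return [[0.0 if i == j else jaccard(items[i], items[j]) for j in range(n)]
            for i in range(n)]


def rank_nodes(weights: List[List[float]], damping: float = 0.85,
               iterations: int = 50) -> List[float]:
    """Weighted PageRank-style scores (assumed, not the paper's variant).

    Nodes without any edge pass no score on, so they keep only the
    (1 - damping) / n baseline.
    """
    n = len(weights)
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores


def mmr_select(scores: List[float], weights: List[List[float]],
               k: int, lam: float = 0.7) -> List[int]:
    """Greedy maximal marginal relevance over the ranked nodes.

    Each step picks the node maximizing
    lam * relevance - (1 - lam) * max similarity to already selected nodes.
    """
    top = max(scores) or 1.0  # scale relevance into [0, 1]
    selected: List[int] = []
    candidates = set(range(len(scores)))

    def mmr(i: int) -> float:
        redundancy = max((weights[i][j] for j in selected), default=0.0)
        return lam * scores[i] / top - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(sorted(candidates), key=mmr)  # ties go to the lowest index
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy input: items 0 and 1 are near-duplicates, as are items 2 and 3.
pas_tokens = [
    {"court", "rejects", "appeal"},
    {"court", "rejects", "the", "appeal"},
    {"storm", "hits", "coast"},
    {"storm", "damages", "coast", "homes"},
]
weights = similarity_matrix(pas_tokens)
scores = rank_nodes(weights)
selected = mmr_select(scores, weights, k=2)
```

In this toy input the re-ranking step avoids selecting both members of a near-duplicate pair, which is the role the abstract assigns to maximal marginal relevance. Generating summary sentences from the selected PASs is outside the scope of the sketch.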
