Journal: Information Processing & Management

SRL-ESA-TextSum: A text summarization approach based on semantic role labeling and explicit semantic analysis

Abstract

Automatic text summarization attempts to provide an effective solution to today's unprecedented growth of textual data. This paper proposes an innovative graph-based text summarization framework for generic single- and multi-document summarization. The summarizer benefits from two well-established text semantic representation techniques, Semantic Role Labelling (SRL) and Explicit Semantic Analysis (ESA), as well as the constantly evolving collective human knowledge in Wikipedia. SRL is used to parse each sentence semantically, and the word tokens of the parsed sentence are represented as vectors of weighted Wikipedia concepts using the ESA method. The essence of the developed framework is to construct a unique concept graph representation underpinned by semantic role-based multi-node (below sentence level) vertices for summarization. We have empirically evaluated the summarization system using the standard, publicly available dataset from the Document Understanding Conference 2002 (DUC 2002). Experimental results indicate that the proposed summarizer outperforms all state-of-the-art related comparators in single-document summarization on the ROUGE-1 and ROUGE-2 measures, while also ranking second in the ROUGE-1 and ROUGE-SU4 scores for multi-document summarization. The testing also demonstrates the scalability of the system: varying the evaluation data size is shown to have little impact on summarizer performance, particularly for the single-document summarization task. In a nutshell, the findings demonstrate the power of the role-based and vectorial semantic representation when combined with the crowd-sourced knowledge base in Wikipedia.
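To make the representation described in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hand-written concept index (`TOY_CONCEPT_INDEX`) in place of a real ESA index built from Wikipedia, takes hand-specified word groups as stand-ins for SRL role arguments, and uses a weighted PageRank-style iteration as the graph ranking step, which the abstract does not specify. All names in the block are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy index: word -> {Wikipedia-style concept: weight}.
# A real ESA index would be derived from Wikipedia article text.
TOY_CONCEPT_INDEX = {
    "bank": {"Finance": 0.9, "River": 0.3},
    "loan": {"Finance": 0.8},
    "money": {"Finance": 0.7},
    "river": {"River": 0.9},
    "water": {"River": 0.6},
}


def concept_vector(words):
    """Sum the concept vectors of the given words; unknown words are ignored."""
    vec = defaultdict(float)
    for word in words:
        for concept, weight in TOY_CONCEPT_INDEX.get(word.lower(), {}).items():
            vec[concept] += weight
    return dict(vec)


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_units(units, damping=0.85, iterations=50):
    """Score text units with a weighted PageRank-style iteration.

    Each unit is a list of words standing in for one semantic-role
    argument. Edge weights are cosine similarities of concept vectors.
    The ranking algorithm itself is an assumption of this sketch.
    """
    n = len(units)
    if n == 0:
        return []
    vecs = [concept_vector(u) for u in units]
    sim = [
        [cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0.0
            )
            updated.append((1.0 - damping) / n + damping * incoming)
        scores = updated
    return scores


# Hand-specified units; a real system would obtain them from an SRL parser.
units = [["bank", "loan"], ["money", "loan"], ["river", "water"]]
scores = rank_units(units)
```

In this toy setting the first unit shares concepts with both of the others, so it receives the highest score; a summarizer built along these lines would select the top-scoring units for the summary.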
