
Exploring events and distributed representations of text in multi-document summarization


Abstract

In this article, we explore an event detection framework to improve multi-document summarization. Our approach is based on a two-stage single-document method that extracts a collection of key phrases, which are then used in a centrality-as-relevance passage retrieval model. We explore how to adapt this single-document method into multi-document summarization methods that can use event information. The event detection method is based on Fuzzy Fingerprint, a supervised method trained on documents with annotated event tags. To cope with the possible use of different terms to describe the same event, we explore distributed representations of text in the form of word embeddings, which helped improve the summarization results. The proposed summarization methods are based on the hierarchical combination of single-document summaries. The automatic evaluation and the human study show that these methods improve upon current state-of-the-art multi-document summarization systems on two mainstream evaluation datasets, DUC 2007 and TAC 2009. We show a relative improvement in ROUGE-1 scores of 16% for TAC 2009 and of 17% for DUC 2007. © 2015 Published by Elsevier B.V.
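The hierarchical combination of single-document summaries mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The centrality score used here (mean bag-of-words cosine similarity to the other sentences) is a simple stand-in for the centrality-as-relevance model, and the key phrase extraction, Fuzzy Fingerprint event detection, and word embeddings are all omitted. The function names and the `per_doc` and `final` parameters are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def summarize(sentences, k):
    """Return the k most central sentences, kept in their original order.

    Assumption: centrality is the mean cosine similarity to all other
    sentences, a stand-in for the paper's centrality-as-relevance model.
    """
    bags = [Counter(tokenize(s)) for s in sentences]
    scores = []
    for i, bag in enumerate(bags):
        others = [cosine(bag, other) for j, other in enumerate(bags) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    # Rank by descending score; ties are broken by original position.
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]
    return [sentences[i] for i in sorted(top)]


def hierarchical_summary(documents, per_doc=2, final=2):
    """Summarize each document, pool the summaries, then summarize the pool."""
    pooled = []
    for doc in documents:
        pooled.extend(summarize(doc, per_doc))
    return summarize(pooled, final)
```

Each document is passed as a list of sentences, for example `hierarchical_summary([doc_a, doc_b], per_doc=2, final=2)`. Pooling the per-document summaries before the second pass is only one possible way to combine them.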

Bibliographic record

  • Source
    Knowledge-Based Systems | 2016, Issue 15 | pp. 33-42 | 10 pages
  • Author affiliations

    INESC ID Lisboa, Rua Alves Redol 9, P-1000029 Lisbon, Portugal|Univ Lisbon, Inst Super Tecn, Ave Rovisco Pais 1, P-1049001 Lisbon, Portugal|Carnegie Mellon Univ, Language Technol Inst, 5000 Forbes Ave, Pittsburgh, PA 15213 USA;

    INESC ID Lisboa, Rua Alves Redol 9, P-1000029 Lisbon, Portugal|Univ Lisbon, Inst Super Tecn, Ave Rovisco Pais 1, P-1049001 Lisbon, Portugal|Carnegie Mellon Univ, Language Technol Inst, 5000 Forbes Ave, Pittsburgh, PA 15213 USA;

    INESC ID Lisboa, Rua Alves Redol 9, P-1000029 Lisbon, Portugal|Inst Univ Lisboa ISCTE IUL, Ave Forcas Armadas, P-1649026 Lisbon, Portugal;

    Carnegie Mellon Univ, Language Technol Inst, 5000 Forbes Ave, Pittsburgh, PA 15213 USA;

    Carnegie Mellon Univ, Language Technol Inst, 5000 Forbes Ave, Pittsburgh, PA 15213 USA;

    INESC ID Lisboa, Rua Alves Redol 9, P-1000029 Lisbon, Portugal|Univ Lisbon, Inst Super Tecn, Ave Rovisco Pais 1, P-1049001 Lisbon, Portugal;

    INESC ID Lisboa, Rua Alves Redol 9, P-1000029 Lisbon, Portugal|Univ Lisbon, Inst Super Tecn, Ave Rovisco Pais 1, P-1049001 Lisbon, Portugal;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Multi-document summarization; Extractive summarization; Event detection; Distributed representations of text;

