Exploiting relevance, coverage, and novelty for query-focused multi-document summarization

Abstract

Summarization plays an increasingly important role as the number of documents on the Web grows exponentially. For query-focused summarization specifically, there are three challenges: (1) how to retrieve query-relevant sentences; (2) how to concisely cover the main aspects (i.e., topics) in the document; and (3) how to balance these two requirements. Regarding relevance in particular, many traditional summarization techniques assume that the relevance of each sentence is independent of the others, which may not hold in practice. In this paper, we go beyond this assumption and propose a novel Probabilistic-modeling Relevance, Coverage, and Novelty (PRCN) framework, which exploits a reference topic model incorporating the user query to measure dependent relevance. Along this line, topic coverage is also modeled under our framework. To further address the issues above, various sentence features regarding relevance and novelty are constructed, while moderate topic coverage is maintained through a greedy algorithm for topic balance. Finally, experiments on the DUC2005 and DUC2006 datasets validate the effectiveness of the proposed method.
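
The abstract does not give the PRCN scoring functions, so the following is only a generic illustration of what a greedy selection that trades off relevance, topic coverage, and novelty might look like. It is not the authors' method. The function name `greedy_select`, the weights `w_cov` and `w_nov`, and the input representation (per-sentence relevance scores, topic sets, and a pairwise similarity matrix) are all assumptions made for this sketch.

```python
from typing import List, Set


def greedy_select(relevance: List[float],
                  topics: List[Set[int]],
                  similarity: List[List[float]],
                  budget: int,
                  w_cov: float = 1.0,
                  w_nov: float = 1.0) -> List[int]:
    """Greedily pick up to `budget` sentence indices.

    Hypothetical scoring (not taken from the paper):
      gain = relevance + w_cov * (newly covered topics)
                       - w_nov * (max similarity to already selected sentences)
    """
    selected: List[int] = []
    covered: Set[int] = set()
    candidates = set(range(len(relevance)))

    while candidates and len(selected) < budget:
        def gain(i: int) -> float:
            # Coverage: how many topics this sentence adds to the summary so far.
            new_topics = len(topics[i] - covered)
            # Novelty: penalize overlap with the most similar selected sentence.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return relevance[i] + w_cov * new_topics - w_nov * redundancy

        # Sorting the candidates makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        covered |= topics[best]
        candidates.remove(best)

    return selected


# Toy input: sentences 0 and 1 are near-duplicates on topic 0,
# sentence 2 is less relevant but covers a different topic.
relevance = [0.9, 0.8, 0.3]
topics = [{0}, {0}, {1}]
similarity = [[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.1],
              [0.1, 0.1, 1.0]]
summary = greedy_select(relevance, topics, similarity, budget=2)
```

On this toy input the second pick is sentence 2 rather than the more relevant sentence 1, because sentence 1 adds no new topic and is highly similar to the sentence already chosen.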

Bibliographic record

  • Source
    Knowledge-Based Systems | 2013, Issue 7 | pp. 33-42 | 10 pages
  • Author affiliations

    The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China, University of Chinese Academy of Sciences, Beijing 100049, China;

    The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China;

    The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China;

    The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    Query-focused document summarization; Dependent relevance; Coverage; Novelty; PHITS;
