Journal: International Journal of Computational Intelligence and Applications
A PROBABILISTIC APPROACH TO MULTI-DOCUMENT SUMMARIZATION FOR GENERATING A TILED SUMMARY

Abstract

Given the widespread use of the Internet, data availability is no longer a major issue; the availability of information and knowledge is. Because of data overload and the time-critical nature of information needs, automatic summarization of documents plays a significant role in information retrieval and text data mining. This paper discusses the design of a multi-document summarizer that uses Katz's K-mixture model for term distribution. The model helps rank sentences through a modified term weight assignment. Highly ranked sentences are selected for the final summary, repetitive sentences are eliminated, and a tiled summary is produced. Our method avoids redundancy and produces a readable (even browsable) summary, which we refer to as an event-specific tiled summary. The system has been evaluated against the frequently occurring sentences in the summaries generated by a set of human subjects. With respect to the ideal summary, our system outperforms other auto-summarizers at different extraction levels of summarization, and it is close to the ideal summary at the 40% extraction level.
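
The pipeline described in the abstract (a K-mixture term distribution, sentence ranking, then removal of repetitive sentences) can be illustrated with a minimal sketch. It rests on several assumptions. The function `k_mixture` uses the standard textbook form of Katz's K-mixture, P(k) = (1 - α)·δ(k,0) + α/(β+1) · (β/(β+1))^k with λ = cf/N, β = (cf - df)/df and α = λ/β, rewritten algebraically so that β = 0 causes no division by zero. The abstract does not specify the paper's modified term weight or its tiling procedure, so the surprisal-based sentence score and the Jaccard-overlap redundancy filter below are stand-ins chosen for illustration only. All function names (`k_mixture`, `term_stats`, `rank_sentences`, `select_sentences`) and the `max_overlap` parameter are hypothetical.

```python
import math
from collections import Counter


def k_mixture(k, cf, df, n_docs):
    """Katz K-mixture probability that a term occurs exactly k times in a document.

    cf: total occurrences of the term in the collection
    df: number of documents that contain the term
    n_docs: number of documents in the collection
    """
    lam = cf / n_docs        # average occurrences per document
    beta = (cf - df) / df    # extra occurrences per document containing the term
    if k == 0:
        return 1.0 - lam / (beta + 1.0)
    # Equivalent to alpha/(beta+1) * (beta/(beta+1))**k with alpha = lam/beta.
    return lam * beta ** (k - 1) / (beta + 1.0) ** (k + 1)


def term_stats(docs):
    """docs: list of documents; each document is a list of tokenized sentences."""
    cf, df = Counter(), Counter()
    for doc in docs:
        counts = Counter(tok for sent in doc for tok in sent)
        cf.update(counts)         # add per-document counts to collection frequency
        df.update(counts.keys())  # each document counts once per term
    return cf, df


def rank_sentences(docs):
    """Return (score, sentence) pairs sorted by descending score."""
    cf, df = term_stats(docs)
    n_docs = len(docs)
    scored = []
    for doc in docs:
        counts = Counter(tok for sent in doc for tok in sent)
        for sent in doc:
            # Assumed weight (not the paper's): surprisal of each term's
            # observed in-document count under the K-mixture model.
            score = sum(
                -math.log2(k_mixture(counts[t], cf[t], df[t], n_docs))
                for t in set(sent)
            )
            scored.append((score, sent))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def select_sentences(scored, limit, max_overlap=0.5):
    """Greedily keep top sentences, skipping near-duplicates (assumed Jaccard test)."""
    chosen = []
    for _, sent in scored:
        words = set(sent)
        if not words:
            continue
        if all(len(words & set(c)) / len(words | set(c)) <= max_overlap for c in chosen):
            chosen.append(sent)
        if len(chosen) == limit:
            break
    return chosen
```

If the extraction level is measured in sentences (an assumption; the abstract does not say), a 40% extraction level would correspond to calling `select_sentences(scored, limit=round(0.4 * len(scored)))`.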