Recent automatic text summarization techniques: a survey

Abstract

Because information on almost every topic is abundantly available on the internet, condensing the important information into a summary would benefit many users. Hence, there is growing interest in the research community in developing new approaches to automatic text summarization. An automatic text summarization system generates a summary, i.e. a short text that includes all the important information in the document. Since the advent of text summarization in the 1950s, researchers have been trying to improve summary generation techniques so that machine-generated summaries match human-made summaries. A summary can be generated through extractive or abstractive methods. Abstractive methods are highly complex because they require extensive natural language processing. The research community has therefore focused more on extractive summarization, trying to achieve more coherent and meaningful summaries. Over the last decade, several extractive approaches to automatic summary generation have been developed, implementing a number of machine learning and optimization techniques. This paper presents a comprehensive survey of the extractive text summarization approaches developed in the last decade. Their needs are identified, and their advantages and disadvantages are listed in a comparative manner. A few abstractive and multilingual text summarization approaches are also covered. Summary evaluation is another challenging issue in this research field. Therefore, both intrinsic and extrinsic methods of summary evaluation are described in detail, along with text summarization evaluation conferences and workshops. Furthermore, evaluation results of extractive summarization approaches are presented on some shared DUC datasets. Finally, the paper concludes with a discussion of future directions that can help researchers identify areas where further research is needed.