Design and development of a concept-based multi-document summarization system for research abstracts

Abstract

This paper describes a new concept-based multi-document summarization system that employs discourse parsing, information extraction and information integration. Dissertation abstracts in the field of sociology were selected as sample documents for this study. The summarization process includes four major steps: (1) parsing dissertation abstracts into five standard sections; (2) extracting research concepts (often operationalized as research variables) and their relationships, the research methods used and the contextual relations from specific sections of the text; (3) integrating similar concepts and relationships across different abstracts; and (4) combining and organizing the different kinds of information using a variable-based framework, and presenting them in an interactive web-based interface. The accuracy of each summarization step was evaluated by comparing the system-generated output against human coding. The user evaluation carried out in the study indicated that the majority of subjects (70%) preferred the concept-based summaries generated by the system to the sentence-based summaries generated by traditional sentence extraction techniques.
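As a rough illustration of step (3), the sketch below groups concepts from different abstracts under a shared head word. It is not the system's actual method, which the abstract does not specify: the `Abstract` class, the `integrate_concepts` function, and the rule that the last word of a noun phrase is its head are all assumptions made for this example, and the concepts are taken as already extracted in step (2).

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Abstract:
    """Hypothetical container for one dissertation abstract."""
    doc_id: str
    # Concepts assumed to have been extracted already (step 2).
    concepts: list[str] = field(default_factory=list)


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace; a stand-in for real term normalization."""
    return " ".join(term.lower().split())


def head_word(term: str) -> str:
    """Assumption: the last word of an English noun phrase is its head."""
    return normalize(term).split()[-1]


def integrate_concepts(abstracts: list[Abstract]) -> dict[str, dict[str, list[str]]]:
    """Group concepts across abstracts by head word.

    Returns a mapping {head word: {concept: [ids of abstracts mentioning it]}}.
    """
    grouped: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for abstract in abstracts:
        for concept in abstract.concepts:
            term = normalize(concept)
            if not term:
                continue  # skip empty strings
            doc_ids = grouped[head_word(term)][term]
            if abstract.doc_id not in doc_ids:
                doc_ids.append(abstract.doc_id)
    # Convert nested defaultdicts to plain dicts for a stable return type.
    return {head: dict(terms) for head, terms in grouped.items()}


if __name__ == "__main__":
    docs = [
        Abstract("a1", ["Academic Achievement", "parental involvement"]),
        Abstract("a2", ["student achievement", "academic achievement"]),
    ]
    for head, terms in integrate_concepts(docs).items():
        print(head, terms)
```

Grouping by head word is only one simple way to bring related terms such as "academic achievement" and "student achievement" together; a fuller system would likely also need to handle synonyms and concepts whose similarity is not visible on the surface.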
