Journal: Application Research of Computers (计算机应用研究)

An Automatic Summarization Method Based on a Thematic Term Set (一种基于主题词集的自动文摘方法)

     

Abstract

This paper proposes an automatic summarization method based on a thematic term set for automatically extracting abstracts from Chinese documents. Given the extracted thematic term set, the method weights each sentence by the weights of the thematic terms it contains and sums these contributions into a total weight for each sentence. It then selects the highest-weighted sentences according to the summarization ratio and outputs them in their original order. Experiments were conducted on the single-document automatic summarization corpus of the HIT IR-lab (Harbin Institute of Technology Information Retrieval Lab), and the resulting summaries were assessed with intrinsic automatic evaluation measures. The method achieves an overall F-measure of 66.07%, which suggests that the generated summaries are of high quality and close to the reference abstracts.
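The pipeline described in the abstract (weight sentences by thematic terms, keep a fixed proportion, restore original order) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names, the punctuation-based sentence splitter, scoring by occurrence count times term weight, the default ratio, and the sentence-overlap F-measure are all hypothetical, and the thematic term weights are assumed to come from a separate extraction step that the abstract does not detail.

```python
import re


def split_sentences(text):
    """Split text into sentences on Chinese/Western end punctuation.

    Assumption: a simple punctuation rule is enough for illustration.
    The ASCII period is deliberately ignored so decimals stay intact.
    """
    parts = re.findall(r"[^。！？!?]+[。！？!?]?", text)
    return [p.strip() for p in parts if p.strip()]


def sentence_weight(sentence, term_weights):
    """Sum the weights of the thematic terms occurring in the sentence.

    Assumption: each occurrence of a term contributes its full weight.
    """
    return sum(sentence.count(term) * w for term, w in term_weights.items())


def summarize(text, term_weights, ratio=0.3):
    """Return the top-weighted sentences, in their original order.

    `ratio` is the (hypothetical) proportion of sentences to keep;
    at least one sentence is always returned for non-empty input.
    """
    sentences = split_sentences(text)
    if not sentences:
        return []
    k = max(1, round(len(sentences) * ratio))
    scored = [(sentence_weight(s, term_weights), i)
              for i, s in enumerate(sentences)]
    # Highest weight first; ties broken by earlier position.
    top = sorted(scored, key=lambda p: (-p[0], p[1]))[:k]
    keep = sorted(i for _, i in top)
    return [sentences[i] for i in keep]


def f_measure(candidate, reference):
    """Sentence-overlap F-measure between two summaries.

    Assumption: exact sentence matching; the abstract does not state
    how its intrinsic evaluation compares summaries.
    """
    cand, ref = set(candidate), set(reference)
    overlap = len(cand & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A call such as `summarize(text, {"主题词": 0.6, "自动文摘": 0.4}, ratio=0.5)` would return roughly half of the sentences, chosen by weight but listed in document order. Sorting the selected indices before output is what preserves the original order that the abstract calls for.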
