Journal: 中文信息学报 (Journal of Chinese Information Processing)

An LDA-Based CRF Method for Automatic Summarization

     

Abstract

In recent years, Latent Dirichlet Allocation (LDA) has been widely applied to document clustering, text classification, text segmentation, and even unsupervised query-based multi-document summarization. LDA is regarded as effective at modeling the shallow semantics of text. Building on previous work, this paper proposes an LDA-based Conditional Random Field (CRF) automatic summarization method (LCAS). We study the role of LDA in supervised, extraction-based single-document summarization, propose adding the topics extracted by LDA as features when training the CRF model, and analyze how the number of topics affects the resulting summaries. Experiments show that adding LDA features effectively improves the quality of a CRF summarization system that takes traditional features as input.
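The idea of feeding LDA topics into a CRF as extra features can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the paper's implementation: the function names, the specific "traditional" features (relative position, length, first-sentence flag), and the choice to represent each sentence by a per-topic probability vector are all hypothetical. The topic vectors are assumed to come from an LDA model trained separately. The output is a list of feature dictionaries, one per sentence, a layout commonly accepted by CRF toolkits.

```python
from typing import Dict, List, Sequence


def sentence_features(
    position: int,
    n_sentences: int,
    tokens: Sequence[str],
    topic_probs: Sequence[float],
) -> Dict[str, float]:
    """Build one sentence's feature dict (hypothetical feature set)."""
    # Illustrative "traditional" features; the paper's actual set is not given here.
    feats: Dict[str, float] = {
        "bias": 1.0,
        "rel_position": position / max(n_sentences - 1, 1),
        "length": float(len(tokens)),
        "is_first": 1.0 if position == 0 else 0.0,
    }
    # Assumed LDA features: one real-valued feature per topic.
    for k, p in enumerate(topic_probs):
        feats[f"topic_{k}"] = float(p)
    # Also mark the most probable topic as an indicator feature.
    if topic_probs:
        best = max(range(len(topic_probs)), key=lambda k: topic_probs[k])
        feats[f"dominant_topic={best}"] = 1.0
    return feats


def document_features(
    sentences: Sequence[Sequence[str]],
    topic_matrix: Sequence[Sequence[float]],
) -> List[Dict[str, float]]:
    """Turn a tokenized document into a CRF input sequence.

    topic_matrix[i] is the (assumed precomputed) LDA topic distribution
    for sentence i.
    """
    if len(sentences) != len(topic_matrix):
        raise ValueError("need one topic vector per sentence")
    n = len(sentences)
    return [
        sentence_features(i, n, sentences[i], topic_matrix[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    doc = [["lda", "models", "topics"], ["crf", "labels", "sentences"]]
    topics = [[0.8, 0.2], [0.3, 0.7]]  # made-up values for illustration
    for feats in document_features(doc, topics):
        print(feats)
```

In a setup like this, varying the number of topics only changes the width of each topic vector, so the same feature builder can be reused when comparing different topic counts.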
