Conference: Knowledge Discovery and Data Mining, 2010. WKDD '10

Topic Extraction for a Large Document Set with the Topic Integration

Abstract

We propose a method for extracting topics from a large document set by integrating topics extracted from several small document sets. Topics are extracted by applying Non-negative Matrix Factorization (NMF) to the document sets. Integrating topics from small document sets is useful because extracting topics from a large document set with NMF takes a long time when the number of documents is large. In this paper, we shorten the processing time of topic extraction from a large document set by integrating the topics extracted from the individual small document sets. We also evaluate the proposed method in terms of the compatibility between the integrated topics and the topics obtained by applying NMF directly to the large document set, and in terms of the NMF processing times.
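
The abstract does not say how the topics from the small document sets are integrated, so the following is only a rough sketch of one plausible reading, not the authors' procedure. It assumes a document-by-term count matrix, plain multiplicative-update NMF, integration by stacking the per-subset topic vectors and factorizing them again, and cosine similarity as a stand-in for the "compatibility" measure. The function names (`nmf`, `integrate_topics`, `cosine`), the toy data, and all parameter values are hypothetical.

```python
import random

def matmul(a, b):
    # Plain-Python matrix product of two lists of rows.
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    # Factorize non-negative v (m x n) as w (m x k) times h (k x n)
    # using multiplicative updates for the squared-error objective.
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return w, h

def integrate_topics(v, k, chunk_size):
    # Assumption: extract k topics from each small document set, stack the
    # topic-term rows, then run NMF once more on the stacked topic matrix.
    stacked = []
    for start in range(0, len(v), chunk_size):
        _, h = nmf(v[start:start + chunk_size], k)
        stacked.extend(h)
    _, h_int = nmf(stacked, k)
    return h_int

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

# Toy document-term counts (rows: documents, columns: terms); made-up data.
docs = [
    [3, 2, 1, 0, 0, 0],
    [2, 3, 2, 0, 0, 0],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 1, 2, 3],
    [4, 1, 2, 0, 0, 0],
    [0, 0, 0, 3, 1, 2],
]

integrated = integrate_topics(docs, k=2, chunk_size=3)  # topics via integration
_, direct = nmf(docs, k=2)                              # topics via direct NMF
# Best cosine match of each integrated topic among the direct topics.
scores = [max(cosine(t, d) for d in direct) for t in integrated]
```

In this sketch each small factorization works on only `chunk_size` documents, and the final factorization works on the stacked topic rows instead of the full document set, which is where any time saving would come from. Whether that matches the integration step and the compatibility measure used in the paper cannot be determined from the abstract alone.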