Journal: Information Processing & Management

Topic discovery based on text mining techniques



Abstract

In this paper, we present a topic discovery system that aims to reveal the implicit knowledge present in news streams. This knowledge is expressed as a hierarchy of topics and subtopics, where each topic contains the set of documents related to it and a summary extracted from those documents. The summaries built this way are useful for browsing the generated hierarchies and selecting topics of interest. Our proposal consists of a new incremental hierarchical clustering algorithm that combines partitional and agglomerative approaches, drawing on the main benefits of each. Finally, a new summarization method based on Testor Theory is proposed to build the topic summaries. Experimental results on the TDT2 collection demonstrate the system's usefulness and effectiveness not only as a topic detection system, but also as a classification and summarization tool.
