Dynamic Pattern Mining: An Incremental Data Clustering Approach

Abstract

We propose a mining framework that supports the identification of useful patterns based on incremental data clustering. Given the popularity of Web news services, we focus on news stream mining. News articles are retrieved from Web news services and processed by data mining tools to produce useful higher-level knowledge, which is stored in a content description database. By exploiting the knowledge in this database, an information delivery agent can answer a user request without interacting with a Web news service directly. A key challenge in news repository management is the high rate of document insertion. To address this problem, we present an incremental hierarchical document clustering algorithm that uses a neighborhood search. The novelty of the proposed algorithm is its ability to identify meaningful patterns (e.g., news events and news topics) while reducing the amount of computation by maintaining the cluster structure incrementally. In addition, to overcome the lack of topical relations in conceptual ontologies, we propose a topic ontology learning framework that utilizes the obtained document hierarchy. Experimental results demonstrate that the proposed clustering algorithm produces high-quality clusters, and that the topic ontology provides interpretations of news topics at different levels of abstraction.
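
To make the idea of maintaining a cluster structure incrementally more concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes bag-of-words vectors, cosine similarity, and a fixed similarity threshold, and it substitutes a linear scan over flat cluster centroids for the paper's neighborhood search and hierarchy. All names (`IncrementalClusterer`, `insert`, `threshold`) are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a document into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class IncrementalClusterer:
    """Hypothetical flat incremental clusterer (illustration only)."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed value, not from the paper
        self.centroids = []         # summed term vectors, one per cluster
        self.members = []           # document ids, one list per cluster

    def insert(self, doc_id, text):
        """Assign one new document without re-clustering earlier ones."""
        vec = vectorize(text)
        best, best_sim = None, 0.0
        # Linear scan over centroids; stands in for a neighborhood search.
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= self.threshold:
            # Counter.update adds counts, so the centroid stays a running sum.
            self.centroids[best].update(vec)
            self.members[best].append(doc_id)
            return best
        # No sufficiently similar cluster: start a new one.
        self.centroids.append(Counter(vec))
        self.members.append([doc_id])
        return len(self.centroids) - 1


clusterer = IncrementalClusterer(threshold=0.3)
first = clusterer.insert("d1", "earthquake hits coastal city")
second = clusterer.insert("d2", "earthquake damages coastal city homes")
third = clusterer.insert("d3", "election results announced today")
```

Each insertion costs one pass over the existing centroids rather than a full re-clustering of the repository, which is the property that matters when documents arrive at a high rate. Because cosine similarity is scale-invariant, keeping a summed vector per cluster is equivalent to comparing against the cluster mean.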
