Conference: IEEE International Conference on Data Mining Workshops

A New Fast Minimum Spanning Tree-Based Clustering Technique

Abstract

Because clustering has important applications in data mining, many clustering techniques have been developed. Today's real-world databases typically hold millions of items with many thousands of fields, producing datasets that reach terabytes in size. At this scale many traditional clustering techniques become increasingly limited, and computationally efficient approaches have become increasingly popular. In this paper, we propose a new efficient approach to graph-theoretical clustering that uses a minimum spanning tree representation of a dataset and consists of two phases. In the first phase, we modify the standard Prim's algorithm so that the tree can be constructed efficiently using k-nearest neighbor search mechanisms; during construction, a new edge weight is defined to maximize the intra-cluster similarity and minimize the inter-cluster similarity of the data set. In the second phase, based on the intuition that data points in the same cluster are closer to each other than to points in different clusters, the longest edges in the minimum spanning tree obtained from the first phase are removed to form clusters, as standard minimum spanning tree-based clustering algorithms do. Experiments on synthetic as well as real data sets show that the proposed approach performs well relative to state-of-the-art methods.
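The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the new edge weight, so plain Euclidean distance is assumed here, and a brute-force neighbor search stands in for an efficient k-nearest neighbor mechanism. The function names (`knn_graph`, `prim_forest`, `mst_clusters`) and the parameters `k` and `n_clusters` are hypothetical.

```python
import heapq
import math


def knn_graph(points, k):
    """Undirected k-nearest-neighbor graph (brute force, Euclidean distance assumed)."""
    n = len(points)
    adj = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            adj[i][j] = d
            adj[j][i] = d
    return adj


def prim_forest(adj):
    """Phase 1: Prim's algorithm restricted to the sparse k-NN graph.

    If the k-NN graph is disconnected, Prim is restarted in each component,
    so the result is a minimum spanning forest rather than a single tree.
    """
    visited = set()
    edges = []
    for start in adj:
        if start in visited:
            continue
        visited.add(start)
        heap = [(w, start, v) for v, w in adj[start].items()]
        heapq.heapify(heap)
        while heap:
            w, u, v = heapq.heappop(heap)
            if v in visited:
                continue
            visited.add(v)
            edges.append((w, u, v))
            for nxt, w2 in adj[v].items():
                if nxt not in visited:
                    heapq.heappush(heap, (w2, v, nxt))
    return edges


def mst_clusters(points, n_clusters, k=3):
    """Phase 2: drop the longest tree edges, then label connected components."""
    n = len(points)
    edges = sorted(prim_forest(knn_graph(points, k)))
    # A forest on n vertices with e edges has n - e components, so keeping
    # the n - n_clusters shortest edges leaves n_clusters components
    # (or more, if the k-NN graph was already split into more pieces).
    keep = edges[: max(0, n - n_clusters)]

    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _, u, v in keep:
        parent[find(u)] = find(v)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]
```

For example, `mst_clusters([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], n_clusters=2, k=2)` separates the two groups of three points. Restricting Prim's algorithm to k-NN edges keeps the candidate edge set at roughly n·k instead of n², which is the kind of saving the first phase aims for; the brute-force neighbor search in this sketch does not itself deliver that saving.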