International Journal of Intelligent Systems and Applications

Efficient Data Clustering Algorithms: Improvements over Kmeans



Abstract

This paper presents a new approach to overcoming well-known disadvantages of the classical Kmeans clustering algorithm: the random initialization of prototypes and the requirement that the number of clusters in the dataset be specified in advance. Randomly initialized prototypes often cause the algorithm to converge to a local rather than a global optimum, so a satisfactory result may only be obtained by running Kmeans many times. The proposed algorithms are based on a new definition of the density of data points, derived from the k-nearest neighbor method. With this definition we detect noise and outliers, which strongly affect Kmeans, and obtain good initial prototypes in a single run while automatically determining the number of clusters K. This algorithm is referred to as Efficient Initialization of Kmeans (EI-Kmeans). Even so, Kmeans is suited only to clustering data with convex shapes and similar sizes and densities. We therefore develop a new clustering algorithm, the Efficient Data Clustering Algorithm (EDCA), which uses the same definition of density. The results show that the proposed algorithms improve on the data clustering of Kmeans, and that EDCA is able to detect clusters with non-convex shapes and with different sizes and densities.
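The abstract does not state the authors' density definition, so the following is only an illustrative sketch of the general idea of density-based initialization, not EI-Kmeans itself. It assumes that density is the inverse of the mean distance to the k nearest neighbors, that the lowest-density fraction of points is treated as noise, and that prototypes are picked greedily from the densest points subject to a minimum separation. The names `knn_density`, `density_init`, `noise_quantile`, and `min_separation` are hypothetical.

```python
import numpy as np


def knn_density(X, k=5):
    """Assumed density: inverse of the mean distance to the k nearest neighbors."""
    # Full pairwise Euclidean distance matrix (O(n^2) memory, small data only).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # After sorting each row, column 0 is the point itself (distance 0).
    knn_dist = np.sort(dist, axis=1)[:, 1:k + 1]
    density = 1.0 / (knn_dist.mean(axis=1) + 1e-12)
    return density, dist


def density_init(X, k=5, noise_quantile=0.1, min_separation=5.0):
    """Pick initial prototypes from dense points; K is the number found."""
    density, dist = knn_density(X, k)
    # Assumption: the lowest-density fraction of points is noise/outliers.
    keep = density >= np.quantile(density, noise_quantile)
    chosen = []
    # Visit points from densest to sparsest; accept a point as a prototype
    # only if it is far enough from every prototype accepted so far.
    for i in np.argsort(-density):
        if keep[i] and all(dist[i, j] > min_separation for j in chosen):
            chosen.append(i)
    return X[chosen], keep


# Toy data: two tight blobs plus one far-away outlier (the last row).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=[10.0, 10.0], scale=0.3, size=(50, 2))
outlier = np.array([[5.0, -20.0]])
X = np.vstack([blob_a, blob_b, outlier])

prototypes, keep = density_init(X)
print("number of prototypes:", len(prototypes))
print("outlier kept:", bool(keep[-1]))
```

The resulting prototypes could then be handed to any Kmeans implementation that accepts explicit initial centers, for example scikit-learn's `KMeans(n_clusters=len(prototypes), init=prototypes, n_init=1)`. The choice of `min_separation` is dataset-dependent in this sketch; the paper's automatic determination of K presumably avoids such a hand-set threshold.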
