
An Enhancement of K-means Clustering Algorithm


Abstract

This paper studies the K-means clustering algorithm and one of its enhancements. Clustering is the classification of objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters) so that the data in each subset ideally share some common trait, often proximity according to some defined distance measure. A popular clustering technique is K-means, which partitions the data into K clusters. In this method the number of clusters is predefined, and the technique is highly dependent on the initial identification of elements that represent the clusters well. If the number of sample data points is too large, cluster membership may become unstable. Another problem is the selection of initial seed points, because clustering results always depend on the initial seed points and partitions. To address this problem, a refining-initial-points algorithm is presented; by refining the initial conditions, it can reduce execution time and improve solutions for large data sets. The experimental results show that the refining-initial-points algorithm is superior to the K-means algorithm.
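The abstract does not spell out the steps of the refinement, so the following is only a minimal sketch of one common way to refine initial points, offered as an assumption rather than the paper's method: run K-means on several small random subsamples, pool the resulting centroids, re-cluster the pool starting from each subsample solution, and keep the candidate with the lowest distortion on the pool. All function names and parameters (`refine_initial_points`, `n_subsamples`, `sample_size`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centers, iters=50):
    """Plain Lloyd iterations starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # Mean of each group; an empty group keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def distortion(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def refine_initial_points(points, k, n_subsamples=5, sample_size=50, seed=0):
    """Assumed refinement scheme: subsample, cluster, pool, re-cluster."""
    rng = random.Random(seed)
    size = min(sample_size, len(points))
    # 1. Cluster each small subsample from random starting points.
    solutions = []
    for _ in range(n_subsamples):
        sample = rng.sample(points, size)
        solutions.append(kmeans(sample, rng.sample(sample, k)))
    # 2. Pool all subsample centroids and cluster the pool,
    #    starting once from each subsample solution.
    pool = [c for sol in solutions for c in sol]
    candidates = [kmeans(pool, sol) for sol in solutions]
    # 3. Keep the candidate with the lowest distortion on the pool.
    return min(candidates, key=lambda c: distortion(pool, c))


# Usage on synthetic 2-D data (three Gaussian blobs, made up for illustration).
data_rng = random.Random(1)
data = [
    (data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
    for cx, cy in [(0, 0), (5, 5), (0, 5)]
    for _ in range(100)
]
init = refine_initial_points(data, k=3)
final = kmeans(data, init)
```

Because the subsamples are small, the refinement step is cheap compared with a full K-means run, and the full run then starts from seeds that already summarize the data. That is in line with the reduced execution time the abstract reports, though the sketch makes no claim to reproduce those results.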
