Indian Journal of Computer Science and Engineering

DETERMINING THE NUMBER OF CLUSTERS FOR A K-MEANS CLUSTERING ALGORITHM

Abstract

Clustering is the process of dividing data into a number of groups. Every data point has some mathematical parameters according to which the grouping can be done. For instance, given a number of points on a two-dimensional grid, the x and y coordinates of the points are the parameters used for clustering. If the k-means algorithm is run with k=3, the data points are split into 3 groups such that the sum of the within-group variances is minimized. The problem, of course, is the choice of the parameter k: splitting the data points into 2 or 4 groups might model the data much better. Determining the "best" value of k is a broad problem, since there is no obvious criterion by which to choose it. This paper looks at a new, efficient approach to determining the number of clusters.
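
The abstract does not describe the paper's own approach, so the sketch below does not attempt to reproduce it. It only illustrates the problem the abstract raises: it runs a plain k-means (Lloyd's algorithm) for several values of k on a small synthetic two-dimensional data set and compares the within-cluster sum of squares (WCSS), which is the basis of the common "elbow" heuristic. The data set, the function names, and the deterministic farthest-point seeding are all assumptions made for this illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from all centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm; returns (centers, within-cluster sum of squares)."""
    centers = farthest_point_init(points, k)
    for _ in range(max_iters):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    wcss = sum(sq_dist(p, centers[i]) for i, g in enumerate(groups) for p in g)
    return centers, wcss


# Hypothetical data: three well-separated 3x3 grids of points in the plane.
offsets = [(dx, dy) for dx in (-0.5, 0.0, 0.5) for dy in (-0.5, 0.0, 0.5)]
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(cx + dx, cy + dy) for cx, cy in blob_centers for dx, dy in offsets]

# WCSS for k = 1..5; the elbow heuristic looks for where the drop levels off.
wcss_by_k = {k: kmeans(points, k)[1] for k in range(1, 6)}
for k, w in wcss_by_k.items():
    print(f"k={k}: WCSS={w:.2f}")
```

On this data the WCSS falls sharply up to k=3 and only marginally afterwards, which matches the three groups built into the example. On real data the bend is often far less clear, which is why the choice of k is a hard problem in general.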