Venue: International Conference on Emerging Trends in Information Technology
An Empirical Evaluation of K-Means Clustering Algorithm Using Different Distance/Similarity Metrics

Abstract

k-means is an effective and efficient clustering algorithm. It uses a distance/similarity metric to measure the distance/similarity between data objects. Objects that are close/similar to each other are assigned to the same cluster, whereas distant/dissimilar objects are assigned to different clusters. Most implementations of k-means are based on the Euclidean or squared Euclidean distance metric. To determine whether other distance/similarity metrics can be used with the k-means algorithm, an empirical evaluation has been performed. In this paper, the accuracy, performance, and reliability of 13 different distance/similarity measures are compared over 6 different variations of the data using the k-means algorithm, based on an empirical evaluation on the well-known benchmark IRIS data set. Accuracy is measured in terms of the similarity of cluster assignment between the ground truth and the machine clustering. Performance is measured in terms of the number of iterations needed for the final cluster assignment to converge. Reliability is measured on the basis of the correctness of the cluster assignment.
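The idea of running k-means with an interchangeable distance metric, and of counting iterations until the cluster assignment stops changing, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`kmeans`, `euclidean`, `manhattan`), the choice of Manhattan distance as the second metric, the toy data, and the use of the arithmetic mean as the centroid update for every metric are all assumptions made for illustration. The abstract does not list the 13 measures or describe the update rule used.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]
Metric = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Point, b: Point) -> float:
    """Sum of absolute coordinate differences (illustrative alternative metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans(
    points: List[Point],
    init_centroids: List[Point],
    metric: Metric,
    max_iter: int = 100,
) -> Tuple[List[int], int]:
    """Return (labels, iterations) for k-means with a pluggable metric.

    Assumption: centroids are updated as the arithmetic mean of their
    members regardless of the metric, which is a simplification.
    """
    centroids = [list(c) for c in init_centroids]
    labels = [-1] * len(points)
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: metric(p, centroids[j]))
            for p in points
        ]
        # Converged when no point changes cluster.
        if new_labels == labels:
            return labels, iteration
        labels = new_labels
        # Update step: recompute each non-empty cluster's centroid.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, max_iter


# Hypothetical toy data: two well-separated groups in 2-D.
toy_points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
toy_init = [toy_points[0], toy_points[2]]

for name, metric in [("euclidean", euclidean), ("manhattan", manhattan)]:
    labels, iterations = kmeans(toy_points, toy_init, metric)
    print(name, labels, iterations)
```

On data this cleanly separated, both metrics give the same assignment in the same number of iterations. Differences between metrics, as studied in the paper, would only be expected to show up on less separable data such as IRIS.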