BMC Bioinformatics

A practical comparison of two K-Means clustering algorithms


Abstract

Background: Data clustering is a powerful technique for identifying data with similar characteristics, such as genes with similar expression patterns. However, not all implementations of clustering algorithms yield the same performance or the same clusters.

Results: In this paper, we study two implementations of a general method for data clustering: k-means clustering. Our experiments compare the running times and distance efficiency of Lloyd's k-means clustering and Progressive Greedy k-means clustering.

Conclusion: Based on our implementation, Lloyd's k-means clustering algorithm is more efficient, not only in processing time but also in terms of mean squared difference (MSD). The analysis was performed on both a gene expression level sample and randomly generated datasets in three-dimensional space. However, other circumstances may dictate a different choice.
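As a rough illustration of the quantities the abstract compares, the following is a minimal Python sketch of Lloyd's k-means iteration together with one plausible reading of MSD: the mean squared Euclidean distance from each point to its assigned centroid. It is not the authors' implementation. The function names, the random initialization from the data points, the handling of empty clusters, and the exact MSD definition are all assumptions made for this sketch. Progressive Greedy k-means is not sketched, since the abstract does not describe its steps.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
    return new_centroids


def lloyd_kmeans(points, k, max_iter=100, seed=0):
    """Alternate assignment and update steps until the labels stop changing."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels


def mean_squared_difference(points, centroids, labels):
    """Assumed MSD: mean squared distance from each point to its centroid."""
    total = sum(squared_distance(p, centroids[l]) for p, l in zip(points, labels))
    return total / len(points)


if __name__ == "__main__":
    # Randomly generated points in three-dimensional space, as in the abstract.
    rng = random.Random(42)
    data = [(rng.random(), rng.random(), rng.random()) for _ in range(300)]
    cents, labs = lloyd_kmeans(data, k=3)
    print("MSD:", mean_squared_difference(data, cents, labs))
```

Timing a call to `lloyd_kmeans` (for example with `time.perf_counter`) and computing the MSD of its result gives the two kinds of measurement the abstract refers to. Any competing algorithm would need to be run on the same data and the same `k` for the comparison to be meaningful.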
