Conference: International Conference on Internet and Distributed Computing Systems

A High Performance Modified K-Means Algorithm for Dynamic Data Clustering in Multi-core CPUs Based Environments

Abstract

The K-means algorithm is one of the most widely used methods in data mining and statistical data analysis for partitioning a set of objects into K distinct groups, called clusters, on the basis of their similarities. The main problem of this algorithm is that it requires the number of clusters as an input, but in real life it is very difficult to fix such a value in advance. In this work we propose a parallel modified K-means algorithm in which the number of clusters is increased at run time in an iterative procedure until a given cluster quality metric is satisfied. To improve the performance of the procedure, at each iteration two new clusters are created by splitting only the cluster with the worst value of the quality metric. Furthermore, experiments in a multi-core CPU-based environment are presented.
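The splitting procedure described in the abstract can be sketched roughly as follows. This is a minimal sequential illustration, not the authors' parallel implementation. The abstract does not name the quality metric, so the sketch assumes the root-mean-square distance of a cluster's points to its centroid. The function names (adaptive_kmeans, two_means, rms_spread), the threshold parameter, the farthest-point seeding, and the choice to split the worst cluster locally without reassigning points in other clusters are all assumptions made for the example.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points (tuples)."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def rms_spread(points):
    """Assumed quality metric: RMS distance of the points to their centroid."""
    c = centroid(points)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in points) / len(points))


def two_means(points, iters=20):
    """Split one cluster with Lloyd iterations for K = 2.

    Seeding is an assumption: the first point and the point farthest from it.
    """
    first = points[0]
    centers = [first, max(points, key=lambda p: math.dist(p, first))]
    groups = None
    for _ in range(iters):
        new = [[], []]
        for p in points:
            nearer = 0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            new[nearer].append(p)
        # Stop when the assignment is stable or one side is empty.
        if new == groups or not (new[0] and new[1]):
            groups = new
            break
        groups = new
        centers = [centroid(g) for g in groups]
    return [g for g in groups if g]


def adaptive_kmeans(points, threshold, max_clusters=50):
    """Grow the number of clusters until every cluster meets the threshold.

    Each iteration replaces the cluster with the worst (largest) spread
    by the two clusters obtained from splitting it. threshold should be >= 0.
    """
    clusters = [list(points)]
    while len(clusters) < max_clusters:
        spreads = [rms_spread(c) for c in clusters]
        worst = max(range(len(clusters)), key=spreads.__getitem__)
        if spreads[worst] <= threshold:
            break  # quality metric satisfied for all clusters
        parts = two_means(clusters[worst])
        if len(parts) < 2:
            break  # the worst cluster could not be split further
        clusters[worst:worst + 1] = parts
    return clusters
```

For example, calling adaptive_kmeans(points, threshold=1.0) on a list of 2-D tuples returns a list of clusters, each a list of points, without the caller choosing K up front. In a multi-core setting the per-point distance computations in the assignment loop are the natural candidates for parallel execution, but how the paper distributes that work is not stated in the abstract.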
