Brazilian Conference on Intelligent Systems

Scalable Fast Evolutionary k-Means Clustering


Abstract

The increasing amount of data requires greater scalability from clustering algorithms. The intrinsic parallelism of the MapReduce model provides manageability and reliability for large-scale distributed operations. However, its restrictions hinder the direct application of several traditional clustering algorithms. K-means is one of the few clustering algorithms that satisfy the MapReduce constraints, but it requires the number of clusters to be specified in advance and is sensitive to their initialization. This paper proposes a MapReduce algorithm able to evolve clusters without requiring k-means' parameters to be specified. Through evolutionary operators, the clusters obtained are used to search for better solutions, allowing the algorithm to find high-quality solutions quickly. The algorithm is compared with state-of-the-art MapReduce versions of a systematic algorithm that is able to determine the number of k-means clusters and their initializations. Computational experiments and statistical analyses of the results indicate that the proposed algorithm obtains clusters of quality equal or superior to those of the compared algorithm, and does so faster.
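
To illustrate why k-means fits the MapReduce constraints mentioned in the abstract, the following is a minimal single-machine sketch of one k-means iteration written as a map step and a reduce step. It is not the evolutionary algorithm proposed in the paper, whose operators are not described here. The function names (`map_point`, `reduce_cluster`, `kmeans_iteration`) and the rule that an empty cluster keeps its previous centroid are assumptions made for illustration only.

```python
from collections import defaultdict
from math import dist


def map_point(point, centroids):
    """Map step: emit (index of the nearest centroid, (point, 1))."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, (point, 1)


def reduce_cluster(values):
    """Reduce step: average all points that share one centroid index."""
    dims = len(values[0][0])
    total = [0.0] * dims
    count = 0
    for point, n in values:
        for d in range(dims):
            total[d] += point[d]
        count += n
    return tuple(t / count for t in total)


def kmeans_iteration(points, centroids):
    """One k-means iteration: map every point, group by key, reduce each group."""
    groups = defaultdict(list)
    for p in points:
        key, value = map_point(p, centroids)
        groups[key].append(value)
    # Assumption: a centroid that attracts no points keeps its old position.
    return [
        reduce_cluster(groups[i]) if i in groups else tuple(centroids[i])
        for i in range(len(centroids))
    ]
```

Each map call depends only on a single point and the current centroids, and each reduce call depends only on the values grouped under one key, which is what allows the work to be distributed. The number of centroids and their starting positions must still be supplied by the caller, which is the limitation the paper sets out to remove.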
