Conference: International Conference on High Performance Computing and Applications
Scalable parallel clustering approach for large data using parallel K means and firefly algorithms

Abstract

This paper identifies the limitations of the k-means algorithm and proposes a parallelization of k-means using a firefly-based clustering method. The new parallel architecture can handle a large number of clusters. The firefly algorithm first finds initial optimal cluster centroids, and the k-means algorithm then refines these optimized centroids to improve clustering accuracy. The final convergence issue is also addressed and solved to a great extent. Finally, the modified algorithm is compared experimentally with parallel k-means, and its performance is found to be better than that of the existing algorithm. Four typical benchmark data sets from the UCI Machine Learning Repository are used to demonstrate the results of the techniques. The parallelization can be implemented with the fork/join method in Java, which the authors consider the most effective design method for achieving good parallel performance.
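
The abstract names Java's fork/join method but does not say which part of the computation is split. As a minimal sketch, assuming the assignment step of k-means is the part that is parallelized, the code below divides the point range recursively with a `RecursiveAction`. The class name `ForkJoinKMeansSketch`, the `THRESHOLD` cutoff, and every method name are hypothetical and not taken from the paper. The firefly search is not implemented here; the initial centroids are simply passed in as a stand-in for its output.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Illustrative sketch only: names and structure are assumptions, not the paper's code.
class ForkJoinKMeansSketch {
    // Assumed cutoff: ranges at or below this size are processed sequentially.
    static final int THRESHOLD = 1_000;

    /** Labels points[lo..hi) with their nearest centroid, splitting large ranges in two. */
    static class AssignTask extends RecursiveAction {
        final double[][] points, centroids;
        final int[] labels;
        final int lo, hi;

        AssignTask(double[][] points, double[][] centroids, int[] labels, int lo, int hi) {
            this.points = points;
            this.centroids = centroids;
            this.labels = labels;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected void compute() {
            if (hi - lo <= THRESHOLD) {
                for (int i = lo; i < hi; i++) labels[i] = nearest(points[i], centroids);
                return;
            }
            int mid = (lo + hi) >>> 1;
            // Fork both halves and wait for them; the halves write disjoint label slots.
            invokeAll(new AssignTask(points, centroids, labels, lo, mid),
                      new AssignTask(points, centroids, labels, mid, hi));
        }
    }

    /** Index of the centroid with the smallest squared Euclidean distance to p. */
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double d = 0;
            for (int j = 0; j < p.length; j++) {
                double diff = p[j] - centroids[c][j];
                d += diff * diff;
            }
            if (d < bestDist) { bestDist = d; best = c; }
        }
        return best;
    }

    /** Sequential update step: each centroid becomes the mean of its assigned points. */
    static double[][] update(double[][] points, int[] labels, double[][] old) {
        int k = old.length, dim = old[0].length;
        double[][] sums = new double[k][dim];
        int[] counts = new int[k];
        for (int i = 0; i < points.length; i++) {
            counts[labels[i]]++;
            for (int j = 0; j < dim; j++) sums[labels[i]][j] += points[i][j];
        }
        for (int c = 0; c < k; c++)
            for (int j = 0; j < dim; j++)
                // An empty cluster keeps its previous centroid.
                sums[c][j] = counts[c] == 0 ? old[c][j] : sums[c][j] / counts[c];
        return sums;
    }

    /** Iterates assignment and update until labels stop changing or maxIter is reached. */
    static int[] cluster(double[][] points, double[][] initial, int maxIter) {
        double[][] centroids = initial;
        int[] labels = new int[points.length];
        Arrays.fill(labels, -1);
        for (int it = 0; it < maxIter; it++) {
            int[] next = new int[points.length];
            ForkJoinPool.commonPool()
                        .invoke(new AssignTask(points, centroids, next, 0, points.length));
            if (Arrays.equals(next, labels)) break;
            labels = next;
            centroids = update(points, labels, centroids);
        }
        return labels;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 1}, {10, 10}, {10, 11}};
        // Stand-in for the centroids a firefly search would supply.
        double[][] initial = {{1, 1}, {9, 9}};
        System.out.println(Arrays.toString(cluster(points, initial, 100))); // prints [0, 0, 1, 1]
    }
}
```

In this sketch only the assignment step runs in parallel, because each point's label is independent of the others. The centroid update is left sequential for brevity; a fuller version could have each task return partial sums and merge them.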
