Conference: International Conference on Parallel, Distributed and Grid Computing

Clonal Selection based Parallel Fuzzy Clustering using Map-reduce

Abstract

Over the past decade, clustering, in which data is grouped into distinct clusters, has become an important task in data mining. Data can be clustered using either crisp or fuzzy clustering algorithms. Fuzzy approaches have been found to be more reliable than crisp clustering techniques in terms of accuracy. However, random centroid initialization in fuzzy clustering can cause the solution to converge to a local optimum. In addition, the amount of data keeps growing, so clustering large volumes of data is a major concern. In this paper, we address both issues and propose a parallel fuzzy clustering algorithm based on the clonal selection principle. The clonal selection principle resolves the problem of convergence to local optima, and the distributed MapReduce framework makes it possible to cluster large datasets. Mahout, a scalable library that runs on top of Hadoop, is used for parallel fuzzy clustering. The experimental analysis is carried out on a multi-node Hadoop cluster and validated using different datasets.
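
The abstract does not spell out the algorithm, so the following is only a rough illustration of how the two ideas could fit together, not the authors' implementation. It assumes the standard fuzzy c-means update rules with a fuzzifier of m = 2, a simple elitist clonal selection loop for choosing initial centroids, and plain Python functions standing in for the Hadoop/Mahout map and reduce jobs. All function names and parameter values are hypothetical.

```python
import random
from functools import reduce

M = 2.0  # fuzzifier m (assumed value, not taken from the paper)

def memberships(x, centroids):
    """Standard fuzzy c-means memberships of point x, plus squared distances."""
    d = [max(sum((a - b) ** 2 for a, b in zip(x, c)), 1e-12) for c in centroids]
    inv = [(1.0 / dj) ** (1.0 / (M - 1.0)) for dj in d]
    total = sum(inv)
    return [v / total for v in inv], d

def map_partition(points, centroids):
    """'Map' step: partial sums of u^m * x, u^m and the cost for one data split."""
    k, dim = len(centroids), len(centroids[0])
    num = [[0.0] * dim for _ in range(k)]
    den = [0.0] * k
    cost = 0.0
    for x in points:
        u, d = memberships(x, centroids)
        for j in range(k):
            w = u[j] ** M
            den[j] += w
            cost += w * d[j]
            for t in range(dim):
                num[j][t] += w * x[t]
    return num, den, cost

def combine(a, b):
    """'Reduce' step: add the partial sums produced by two splits."""
    num = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    den = [p + q for p, q in zip(a[1], b[1])]
    return num, den, a[2] + b[2]

def fcm_step(partitions, centroids):
    """One fuzzy c-means iteration over all splits; returns (centroids, cost)."""
    num, den, cost = reduce(combine, [map_partition(p, centroids) for p in partitions])
    return [tuple(n / den[j] for n in num[j]) for j in range(len(den))], cost

def objective(partitions, centroids):
    return sum(map_partition(p, centroids)[2] for p in partitions)

def clonal_init(partitions, k, rng, pop=6, clones=4, gens=10):
    """Illustrative clonal selection: clone the best candidates, mutate the
    clones (worse rank gets a larger mutation), keep the lowest-cost ones."""
    points = [x for p in partitions for x in p]
    population = [rng.sample(points, k) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda c: objective(partitions, c))
        parents = population[: pop // 2]
        pool = list(parents)  # elitism: parents survive unchanged
        for rank, cand in enumerate(parents):
            scale = 0.1 * (rank + 1)
            for _ in range(clones):
                pool.append([tuple(v + rng.gauss(0, scale) for v in c) for c in cand])
        pool.sort(key=lambda c: objective(partitions, c))
        population = pool[:pop]
    return population[0]

def cluster(partitions, k, seed=0, iters=50, tol=1e-6):
    """Clonal-selection initialization followed by fuzzy c-means iterations."""
    rng = random.Random(seed)
    centroids = clonal_init(partitions, k, rng)
    for _ in range(iters):
        new, _cost = fcm_step(partitions, centroids)
        shift = max(abs(a - b) for c0, c1 in zip(centroids, new) for a, b in zip(c0, c1))
        centroids = new
        if shift < tol:
            break
    return centroids
```

Because each split only contributes sums, the result of `fcm_step` does not depend on how the data is partitioned, which is what makes the centroid update a natural fit for a map/reduce job. A call such as `cluster([split_a, split_b], k=2)` treats each list of points as one input split.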
