Conference: International Conference on Information Networking

The performance evaluation of k-means by two MapReduce frameworks, Hadoop vs. Twister

Abstract

In data mining, k-means is a method of cluster analysis that assigns each object to the cluster with the nearest mean. It has been used successfully in many fields, including market segmentation, computer vision, geostatistics, astronomy, and agriculture. However, k-means-like clustering is not easy to fit to the MapReduce model, because its iterative nature is highly likely to produce staggered map tasks. This paper presents a performance evaluation of a k-means application running on the Twister and Hadoop frameworks. We report how to design a MapReduce application that organizes the objects of a dataset into k partitions. This approach provides a way to cluster a dataset in parallel with Hadoop, a MapReduce framework.
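As a rough illustration of how k-means can be phrased as map and reduce steps, the sketch below simulates the pattern in a single Python process. It is not the paper's implementation and uses no Hadoop or Twister API; the names `map_point`, `reduce_cluster`, and `kmeans_mapreduce` are hypothetical, and the stopping rule (centroids unchanged) is an assumption. Each pass of the driver loop stands for one full map/shuffle/reduce round, which is the iterative pattern the abstract refers to.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_point(point, centroids):
    """Map step (hypothetical name): emit (index of nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def reduce_cluster(points):
    """Reduce step (hypothetical name): average the points sharing one key."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_mapreduce(points, centroids, max_iters=20):
    """Driver: one simulated map/shuffle/reduce round per k-means iteration."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iters):
        # Shuffle: group the mapped values by centroid index.
        groups = defaultdict(list)
        for p in points:
            key, value = map_point(p, centroids)
            groups[key].append(value)
        # Reduce each group; a centroid with no points keeps its old position.
        new_centroids = [
            reduce_cluster(groups[i]) if i in groups else centroids[i]
            for i in range(len(centroids))
        ]
        if new_centroids == centroids:  # assumed stopping rule: no change
            break
        centroids = new_centroids
    return centroids


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
result = kmeans_mapreduce(points, [(0.0, 0.0), (10.0, 10.0)])
print(result)
```

In a real framework the driver loop would submit a new job per iteration (or, in an iterative runtime, reuse long-lived tasks), and the centroids would be broadcast to every map task; those details are outside this sketch.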