Conference: 2012 IEEE International Conference on Cluster Computing Workshops

vHadoop: A Scalable Hadoop Virtual Cluster Platform for MapReduce-Based Parallel Machine Learning with Performance Consideration

Abstract

Big data processing is becoming increasingly important due to the continuous growth in the amount of data generated by fields such as particle physics, human genomics, and earth observation. However, the efficiency of processing large-scale data on modern virtual infrastructure, especially virtualized cloud computing infrastructure, is not clear. This paper focuses on the performance of Hadoop virtual clusters and proposes vHadoop, a scalable Hadoop virtual cluster platform for large-scale MapReduce-based parallel data processing. We first describe the design and implementation of the vHadoop platform. We then perform a series of experiments to investigate both its static and dynamic performance, including the performance characterization of a cross-domain Hadoop virtual cluster and the live migration of a Hadoop virtual cluster. After that, we use the vHadoop platform to run six typical parallel clustering algorithms (Canopy, Dirichlet, Fuzzy k-Means, k-Means, Mean Shift, and MinHash) on two typical datasets. Experimental results verify the efficiency of the vHadoop platform in processing MapReduce-based parallel machine learning applications.
