Load feedback-based resource scheduling and dynamic migration-based data locality for virtual Hadoop clusters in OpenStack-based clouds


Abstract

As cloud computing technology matures, it is essential to combine the big data processing tool Hadoop with the Infrastructure as a Service (IaaS) cloud platform. In this study, we first propose a new Dynamic Hadoop Cluster on IaaS (DHCI) architecture, which includes four key modules: monitoring, scheduling, Virtual Machine (VM) management, and VM migration. The monitoring module collects the load of both physical hosts and VMs, and this load information can be used to design resource scheduling and data locality solutions. Second, we present a simple load feedback-based resource scheduling scheme. The scheme can avoid allocating resources on overburdened physical hosts, or achieve strong scalability of the virtual cluster by varying the number of VMs. To improve flexibility, we deploy the computation and storage VMs separately in the DHCI architecture, which negatively impacts data locality. Third, we reuse the VM migration method and propose a dynamic migration-based data locality scheme using parallel computing entropy. We migrate the computation nodes to different host(s) or rack(s) where the corresponding storage nodes are deployed to satisfy the data locality requirement. We evaluate our solutions in a realistic scenario based on OpenStack. Extensive experimental results demonstrate the effectiveness of our solutions, which help balance the workload and improve performance, even under heavily loaded cloud system conditions.
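The two schemes summarized in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm: the abstract gives no thresholds, data structures, or definition of parallel computing entropy, so the `Host` type, the `OVERLOAD`, `SCALE_OUT`, and `SCALE_IN` values, and all function names below are hypothetical. The sketch only shows the general shape of load feedback (skip overburdened hosts, vary the VM count) and of locality-driven migration (move a computation VM toward the host or rack of its storage node).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    name: str
    rack: str
    load: float  # normalized load in [0, 1], as a monitoring module might report it


# Hypothetical thresholds; the abstract does not state concrete values.
OVERLOAD = 0.8   # hosts at or above this load receive no new allocations
SCALE_OUT = 0.7  # average VM load above this adds a VM
SCALE_IN = 0.3   # average VM load below this removes a VM


def pick_host(hosts: list[Host]) -> Optional[Host]:
    """Return the least-loaded host that is not overburdened, or None."""
    candidates = [h for h in hosts if h.load < OVERLOAD]
    return min(candidates, key=lambda h: h.load, default=None)


def target_vm_count(current: int, avg_vm_load: float,
                    min_vms: int = 1, max_vms: int = 16) -> int:
    """Grow or shrink the virtual cluster by one VM based on average VM load."""
    if avg_vm_load > SCALE_OUT:
        return min(current + 1, max_vms)
    if avg_vm_load < SCALE_IN:
        return max(current - 1, min_vms)
    return current


def migration_target(storage_host: Host, hosts: list[Host]) -> Optional[Host]:
    """Prefer the storage node's own host, then the least-loaded host in its rack.

    Assumption: plain host load stands in for the paper's entropy-based criterion.
    """
    if storage_host.load < OVERLOAD:
        return storage_host
    same_rack = [h for h in hosts
                 if h.rack == storage_host.rack and h.name != storage_host.name]
    return pick_host(same_rack)
```

In an OpenStack deployment, the result of `migration_target` would be handed to whatever component triggers live migration; that integration is outside the scope of this sketch.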

Bibliographic details

  • Source
    Tsinghua Science and Technology | 2017, Issue 2 | pp. 149-159 | 11 pages
  • Author affiliations

    School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, and Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing 210003, China;

    Network and Information Center, Institute of Network Technology, Science and Technology on Information Transmission and Dissemination in Communication Networks Laboratory, National Engineering Laboratory for Mobile Network Security, Beijing University of Posts and Telecommunications, Beijing 100876, China;

    School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, and Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing 210003, China;

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Dynamic scheduling; Cloud computing; Computer architecture; Monitoring; Processor scheduling; Virtual machining; Resource management;

