Improving MapReduce scheduler for heterogeneous workloads in a heterogeneous environment

Abstract

Big data is pushing businesses and research sectors to become more data-driven. Hadoop MapReduce is a cost-effective way to process large-scale datasets and is offered as a service over the Internet. Although cloud service providers promise an unlimited supply of on-demand resources, some of the virtual resources hired for MapReduce inevitably remain unutilized, and makespan suffers, because of the various heterogeneities that arise when MapReduce is offered as a service. Because MapReduce v2 lets users define the container size for map and reduce tasks, the jobs in a batch become heterogeneous and behave differently. In addition, virtual machines of different capacities in the MapReduce virtual cluster accommodate varying numbers of map/reduce tasks. These factors strongly affect resource utilization in the virtual cluster and the makespan of a batch of MapReduce jobs. Default MapReduce job schedulers do not consider these heterogeneities that exist in a cloud environment. Moreover, virtual machines in a MapReduce virtual cluster process an equal number of blocks regardless of their capacity, which hurts makespan. We therefore devised a heuristic-based MapReduce job scheduler that exploits heterogeneity at both the virtual machine level and the MapReduce workload level to improve resource utilization and makespan. We propose two methods to achieve this: (i) roulette wheel scheme-based data block placement in heterogeneous virtual machines, and (ii) constrained 2-dimensional bin packing to place heterogeneous map/reduce tasks. We compared the heuristic-based MapReduce job scheduler against the classical fair scheduler in MapReduce v2. Experimental results showed that the proposed scheduler improved makespan and resource utilization by 45.6% and 47.9%, respectively, over the classical fair scheduler.
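The two placement ideas named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' scheduler: the abstract does not state the capacity weights, the bin-packing constraints, or the ordering heuristic. The function names (`place_blocks`, `pack_tasks`), the single-number capacity weight per virtual machine, the choice of memory and vcores as the two dimensions, and the first-fit-decreasing rule are all assumptions made for the example.

```python
import random


def place_blocks(num_blocks, vm_capacity, seed=0):
    """Roulette wheel selection: each block goes to a VM with probability
    proportional to that VM's capacity weight (weights are hypothetical)."""
    rng = random.Random(seed)
    names = list(vm_capacity)
    total = sum(vm_capacity.values())
    placement = []
    for _ in range(num_blocks):
        spin = rng.uniform(0, total)   # where the wheel stops
        running = 0.0
        chosen = names[-1]             # fallback guards against float round-off
        for name in names:
            running += vm_capacity[name]
            if spin <= running:
                chosen = name
                break
        placement.append(chosen)
    return placement


def pack_tasks(tasks, vms):
    """First-fit decreasing over two resources (memory in MB, vcores).

    tasks: list of (task_id, mem, vcores) container requests.
    vms:   dict mapping VM name to (mem, vcores) capacity.
    Returns (assignment dict, list of task ids that did not fit).
    Assumption: a task fits only if BOTH dimensions fit on one VM.
    """
    free = {name: list(cap) for name, cap in vms.items()}
    assignment, unplaced = {}, []
    # Largest requests first, so big containers are not blocked by small ones.
    ordered = sorted(tasks, key=lambda t: (t[1], t[2]), reverse=True)
    for task_id, mem, cores in ordered:
        for name, (free_mem, free_cores) in free.items():
            if mem <= free_mem and cores <= free_cores:
                free[name][0] -= mem
                free[name][1] -= cores
                assignment[task_id] = name
                break
        else:
            unplaced.append(task_id)   # no VM had room in both dimensions
    return assignment, unplaced


if __name__ == "__main__":
    blocks = place_blocks(12, {"small": 1, "medium": 2, "large": 4})
    print(blocks)
    print(pack_tasks(
        [("m1", 2048, 1), ("m2", 4096, 2), ("r1", 6144, 2), ("m3", 1024, 1)],
        {"vm1": (4096, 2), "vm2": (8192, 4)},
    ))
```

With capacity-proportional weights, a larger virtual machine tends to receive more blocks than a smaller one, which is the imbalance the abstract says equal block distribution ignores. The packing function treats each virtual machine as a two-dimensional bin, so a task is rejected when either memory or vcores would overflow.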

Bibliographic details

Source
Concurrency, practice and experience | 2020, Issue 7 | e5558.1-e5558.10 | 10 pages
Authors

Jeyaraj Rathinaraja; Ananthanarayana V. S.; Paul Anand

Author affiliations

Natl Inst Technol Karnataka Dept IT Mangalore Karnataka India;

Kyungpook Natl Univ Sch Comp Sci & Engn 80 Daehakro Daegu 702701 South Korea;

Original format: PDF
Language: English
Keywords
bin packing; heterogeneous workloads; jobs; map; reduce task placement;

Similar literature

1. Improving MapReduce scheduler for heterogeneous workloads in a heterogeneous environment [J]. Jeyaraj Rathinaraja, Ananthanarayana V. S., Paul Anand. Concurrency, practice and experience. 2020, Issue 17
2. MapReduce Scheduler Using Classifiers for Heterogeneous Workloads [J]. Visalakshi P, Karthik TU. International journal of computer science and network security. 2011, Issue 4
3. MapReduce Scheduler Using Classifiers for Heterogeneous Workloads [J]. Visalakshi Pt, Karthik TU. International journal of computer science and network security. 2011, Issue 4
4. A usage-aware scheduler for improving MapReduce performance in heterogeneous environments [C]. Hsiao J.H., Kao S.J. International Conference on Information Science, Electronics and Electrical Engineering. 2014
5. Performance, Energy and Temperature Considerations for Job Scheduling and for Workload Distribution in Heterogeneous Systems [D]. Alsubaihi, Shouq. 2017
6. A Pruning-Based Disk Scheduling Algorithm for Heterogeneous I/O Workloads [O]. Taeseok Kim, Hyokyung Bahn, Youjip Won
7. TMaR: a two-stage MapReduce scheduler for heterogeneous environments [O]. Neda Maleki, Hamid Reza Faragardi, Amir Masoud Rahmani. 2020
