
Cost-Efficient Task Scheduling for Geo-distributed Data Analytics


Abstract

Geo-distributed data processing is affected by many factors; for example, some countries or regions prohibit transmitting original user data abroad. Such data must therefore be processed in a decentralized manner, which raises several problems. First, transferring a job's intermediate data across regions is unavoidable and incurs data transmission costs. Second, WAN bandwidth is often much lower than the bandwidth within a cluster, so it can easily become the bottleneck of a geo-distributed job. In addition, the idle computing resources in a cluster may change over time, which further complicates task scheduling. This paper therefore considers task scheduling for big data jobs on geo-distributed data. Under a budget constraint on the cross-region transmission of intermediate data, and without moving the original data, we design a budget-constrained task scheduling strategy, CETS. Experimental analysis of different scenarios verifies the effectiveness of the proposed strategy.
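
The trade-off described in the abstract (cross-region transfer cost versus slow WAN links, under a spending budget) can be illustrated with a small sketch. This is not the CETS algorithm, whose details the abstract does not give; it is a simple greedy heuristic written only to make the problem concrete. All names (`transfer_cost`, `transfer_time`, `schedule`), the per-GB egress price model, and the bandwidth table are assumptions made for this example. Changing idle resources over time are not modeled.

```python
from typing import Dict, List, Tuple

EPS = 1e-9  # tolerance for float comparisons against the budget


def transfer_cost(inputs: Dict[str, float], site: str,
                  price: Dict[str, float]) -> float:
    """Cost of pulling a task's remote intermediate data to `site`.

    Assumption: each source region charges `price[src]` per GB sent out.
    """
    return sum(gb * price[src] for src, gb in inputs.items() if src != site)


def transfer_time(inputs: Dict[str, float], site: str,
                  bandwidth: Dict[Tuple[str, str], float]) -> float:
    """Time until all remote inputs arrive, assuming parallel WAN transfers."""
    return max((gb / bandwidth[(src, site)]
                for src, gb in inputs.items() if src != site), default=0.0)


def schedule(tasks: Dict[str, Dict[str, float]], sites: List[str],
             price: Dict[str, float],
             bandwidth: Dict[Tuple[str, str], float],
             budget: float) -> Tuple[Dict[str, str], float]:
    """Greedy illustrative heuristic (hypothetical, not CETS).

    Each task goes to the fastest site it can afford, while enough budget
    is held back so every later task can still take its cheapest site.
    Returns the placement and the total amount spent.
    """
    min_cost = {t: min(transfer_cost(inp, s, price) for s in sites)
                for t, inp in tasks.items()}
    reserve = sum(min_cost.values())
    if reserve > budget + EPS:
        raise ValueError("budget is below the cheapest possible placement")

    remaining = budget
    placement: Dict[str, str] = {}
    for t, inp in tasks.items():
        reserve -= min_cost[t]          # budget held back for later tasks
        allowed = remaining - reserve   # what this task may spend
        affordable = [s for s in sites
                      if transfer_cost(inp, s, price) <= allowed + EPS]
        best = min(affordable,
                   key=lambda s: (transfer_time(inp, s, bandwidth),
                                  transfer_cost(inp, s, price)))
        placement[t] = best
        remaining -= transfer_cost(inp, best, price)
    return placement, budget - remaining


# Two regions: A is reached over a fast link but B's egress is expensive.
sites = ["A", "B"]
price = {"A": 1, "B": 2}                        # cost units per GB sent out
bandwidth = {("A", "B"): 0.5, ("B", "A"): 2.0}  # GB/s, keyed (source, destination)
tasks = {"t1": {"A": 4, "B": 4}, "t2": {"A": 4, "B": 4}}  # GB of input per region

placement, spent = schedule(tasks, sites, price, bandwidth, budget=12)
```

With a budget of 12 in this toy setup, only one of the two tasks can afford the fast but expensive site; the other falls back to the cheap, slow one. The original input data never moves: only each task's intermediate inputs are pulled to the chosen site.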
