A Big Data Placement Strategy in Geographically Distributed Datacenters

Abstract

With the pervasiveness of Big Data and the expansion of geographically distributed datacenters in Cloud computing, processing large-scale data applications has become a crucial issue. Indeed, finding the most efficient way to store massive data across distributed locations is increasingly complex. Furthermore, the execution time of a task that requires several datasets might be dominated by the cost of data migrations and exchanges, which depends on the initial placement of the input datasets over the datacenters in the Cloud and on the dynamic data management strategy. In this paper, we propose a data placement strategy that improves workflow execution time by reducing the cost of data movements between geographically distributed datacenters, considering characteristics such as storage capacity and read/write speeds. We formalize the overall problem and then propose a data placement algorithm structured in two phases. First, we compute the estimated transfer time to move all involved datasets from their respective locations to the one where the corresponding tasks are executed. Second, we apply a greedy algorithm to assign each dataset to the optimal datacenter with respect to the overall cost of data migrations. The heterogeneity of the datacenters and their characteristics (storage and bandwidth) are both taken into account. Our experiments are conducted using the CloudSim simulator. The results show that the proposed strategy produces an efficient placement and reduces the overhead of data movement compared to both a random assignment and a selected placement algorithm from the literature.
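
The two phases described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give the cost model or the greedy ordering, so the transfer-time formula (source read time plus link time plus destination write time), the largest-dataset-first ordering, and every name below (`Datacenter`, `transfer_time`, `place_datasets`, the `bandwidth` mapping) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Datacenter:
    name: str
    capacity: float     # free storage in GB (assumed unit)
    read_speed: float   # GB/s (assumed unit)
    write_speed: float  # GB/s (assumed unit)


def transfer_time(size, src, dst, bandwidth):
    """Estimated seconds to move `size` GB from src to dst (0 if co-located).

    Assumed cost model: read at the source, cross the link, write at the target.
    `bandwidth` maps (src_name, dst_name) to link speed in GB/s.
    """
    if src.name == dst.name:
        return 0.0
    link = bandwidth[(src.name, dst.name)]
    return size / src.read_speed + size / link + size / dst.write_speed


def place_datasets(sizes, task_sites, datacenters, bandwidth):
    """Greedy placement: largest dataset first, cheapest feasible datacenter.

    sizes:      {dataset: size in GB}
    task_sites: {dataset: [names of datacenters running tasks that read it]}
    """
    by_name = {dc.name: dc for dc in datacenters}
    free = {dc.name: dc.capacity for dc in datacenters}
    placement = {}
    for ds in sorted(sizes, key=sizes.get, reverse=True):
        size = sizes[ds]
        best, best_cost = None, float("inf")
        for dc in datacenters:
            if free[dc.name] < size:
                continue  # not enough storage left in this datacenter
            # Phase 1: estimated time to ship ds from dc to every consuming task
            cost = sum(transfer_time(size, dc, by_name[site], bandwidth)
                       for site in task_sites[ds])
            # Phase 2: keep the candidate with the lowest total estimated time
            if cost < best_cost:
                best, best_cost = dc.name, cost
        if best is None:
            raise ValueError(f"no datacenter can hold {ds}")
        placement[ds] = best
        free[best] -= size
    return placement
```

In this sketch, a dataset is placed at the datacenter that runs its tasks whenever storage allows, and otherwise at the feasible datacenter with the lowest total estimated transfer time. A greedy pass of this kind is fast but is not guaranteed to find a globally optimal placement.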

Bibliographic record

Source
International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications, 2020, pp. 1-9 (9 pages)
Authors
Laila Bouhouch; Mostapha Zbakh; Claude Tadonki
Original format
PDF
Keywords
Greedy algorithms; Cloud computing; Distributed databases; Bandwidth; Big Data applications; Task analysis
