
Key based data analytics across data centers considering bi-level resource provision in cloud computing


Abstract

Because data sources such as astronomy and sales are inherently distributed, or because of legal prohibitions, it is not always practical to store worldwide data in a single data center (DC). Hadoop is a commonly accepted framework for big data analytics, but it can only handle data within one DC, so distributed data calls for a study of Hadoop across DCs. In this setting, mappers can be placed in the local DCs, but where to place reducers is a major challenge, since each reducer needs to process almost all map output across all involved DCs. This paper proposes a novel architecture and a key-based scheme that respect the locality principle of traditional Hadoop as far as possible while deploying reducers at lower cost. Considering resource provision at both the DC level and the server level, the problem is formalized with bi-level programming and solved by a tailored two-level group genetic algorithm (TLGGA). The final results, which may be dispersed across several DCs, can be aggregated to a designated DC or to the DC with the minimum transfer and storage cost. Extensive simulations demonstrate the effectiveness of TLGGA: it outperforms the baseline and the state-of-the-art mechanisms by 49% and 40%, respectively.
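
The abstract gives no formulas, so the following Python is only an illustrative sketch of two ideas it mentions: placing reducers per key close to the data, and aggregating final results at the DC with the minimum transfer and storage cost. It is not TLGGA and not the paper's bi-level model. The greedy locality rule, the function names, the input layouts, and the linear per-GB cost model are all assumptions made for this example.

```python
from collections import defaultdict


def place_keys_by_locality(map_output):
    """Assign each key's reducer to the DC holding most of that key's map output.

    map_output: {dc_name: {key: size}} (hypothetical layout).
    This is a greedy stand-in for the paper's key-based scheme, not TLGGA.
    """
    per_key = defaultdict(dict)
    for dc, sizes in map_output.items():
        for key, size in sizes.items():
            per_key[key][dc] = per_key[key].get(dc, 0) + size
    # sorted() makes tie-breaking deterministic (lowest DC name wins).
    return {key: max(sorted(by_dc), key=by_dc.get)
            for key, by_dc in per_key.items()}


def cheapest_aggregation_dc(result_size, transfer_cost, storage_cost):
    """Pick the DC minimizing transfer plus storage cost of the final results.

    result_size:   {dc_name: size of the results currently held there}
    transfer_cost: {(src_dc, dst_dc): cost per unit of data moved}
    storage_cost:  {dc_name: cost per unit of data stored}
    The linear cost model is an assumption of this sketch.
    """
    total = sum(result_size.values())

    def cost(target):
        moved = sum(size * transfer_cost[(src, target)]
                    for src, size in result_size.items() if src != target)
        return moved + total * storage_cost[target]

    return min(sorted(storage_cost), key=cost)


if __name__ == "__main__":
    map_output = {"dc_a": {"k1": 10, "k2": 1}, "dc_b": {"k1": 2, "k2": 8}}
    print(place_keys_by_locality(map_output))

    result_size = {"dc_a": 10, "dc_b": 8}
    transfer_cost = {("dc_a", "dc_b"): 0.1, ("dc_b", "dc_a"): 0.1}
    storage_cost = {"dc_a": 0.05, "dc_b": 0.02}
    print(cheapest_aggregation_dc(result_size, transfer_cost, storage_cost))
```

A greedy rule like this ignores DC-level and server-level capacity limits, which is the kind of coupling the abstract says the bi-level formulation is meant to capture.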

Bibliographic information

  • Source
    Future Generation Computer Systems | 2016, Issue 9 | pp. 40-50 | 11 pages
  • Author affiliations

    School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen 518055, China; Public Service Platform of Mobile Internet Application Security Industry, Shenzhen 518057, China

    School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen 518055, China; Shenzhen Key Laboratory of Internet of Information Collaboration, Shenzhen 518055, China

    School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen 518055, China; Shenzhen Key Laboratory of Internet of Information Collaboration, Shenzhen 518055, China

    School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen 518055, China; Shenzhen Applied Technology Engineering Laboratory for Internet Multimedia Application, Shenzhen 518055, China

    School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen 518055, China; Shenzhen Applied Technology Engineering Laboratory for Internet Multimedia Application, Shenzhen 518055, China

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Reducer placement; Resource provision; Hadoop across data centers; Distributed cloud;
