Journal: Tsinghua Science and Technology

Efficient location-aware data placement for data-intensive applications in geo-distributed scientific data centers

Abstract

Recent developments in cloud computing and big data have spurred the emergence of data-intensive applications in which massive scientific datasets are stored in globally distributed scientific data centers and accessed frequently by scientists worldwide. A single data processing task may request multiple associated data items distributed across different scientific data centers, and data placement decisions must respect the storage capacity limits of those centers. Optimizing data access cost when placing data items in globally distributed scientific data centers has therefore become an increasingly important goal. Existing data placement approaches for geo-distributed data items are insufficient because they either cannot account for the cost incurred by associated data access or overlook storage capacity limits, which are a very practical constraint of scientific data centers. In this paper, inspired by applications in the field of high energy physics, we propose an integer-programming-based data placement model that treats the above challenges as a Non-deterministic Polynomial-time (NP)-hard problem. In addition, we use a Lagrangian-relaxation-based heuristic algorithm to obtain ideal data placement solutions. Our simulation results demonstrate that our algorithm is effective and significantly reduces overall data access cost.
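
The abstract does not give the model itself, so the following is only a minimal sketch of the general technique it names, not the authors' formulation. It assumes a simplified placement problem in which each data item goes to exactly one center, each center has a storage capacity, and a hypothetical `cost[i][j]` gives the access cost of item `i` at center `j`. It ignores the cost of associated data access that the paper addresses. The capacity constraints are relaxed with Lagrange multipliers, the multipliers are updated by a subgradient step, and a greedy repair pass builds a feasible placement in each iteration. The function name, parameters, and step-size rule are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def place_items(
    cost: List[List[float]],   # cost[i][j]: assumed access cost of item i at center j
    size: List[float],         # size[i]: storage needed by item i
    capacity: List[float],     # capacity[j]: storage limit of center j
    iterations: int = 50,
    step: float = 1.0,
) -> Tuple[Optional[List[int]], float]:
    """Return (best feasible assignment or None, its total cost)."""
    n, m = len(size), len(capacity)
    lam = [0.0] * m  # one multiplier per relaxed capacity constraint
    best_assign: Optional[List[int]] = None
    best_cost = float("inf")

    for k in range(iterations):
        # Relaxed subproblem: with capacity priced in through lam, each item
        # independently picks the center with the lowest penalized cost.
        relaxed = [
            min(range(m), key=lambda j: cost[i][j] + lam[j] * size[i])
            for i in range(n)
        ]

        # Greedy repair: place the largest items first, each at the cheapest
        # penalized center that still has room.
        remaining = list(capacity)
        assign = [-1] * n
        feasible = True
        for i in sorted(range(n), key=lambda i: -size[i]):
            options = [j for j in range(m) if remaining[j] >= size[i]]
            if not options:
                feasible = False
                break
            j = min(options, key=lambda j: cost[i][j] + lam[j] * size[i])
            assign[i] = j
            remaining[j] -= size[i]
        if feasible:
            total = sum(cost[i][assign[i]] for i in range(n))
            if total < best_cost:
                best_assign, best_cost = assign, total

        # Subgradient step: raise the price of overloaded centers and lower it
        # (never below zero) for centers with spare capacity.
        load = [0.0] * m
        for i, j in enumerate(relaxed):
            load[j] += size[i]
        lam = [
            max(0.0, lam[j] + step / (k + 1) * (load[j] - capacity[j]))
            for j in range(m)
        ]

    return best_assign, best_cost


if __name__ == "__main__":
    # Three unit-size items all prefer center 0, which can hold only two.
    print(place_items([[1, 5], [1, 5], [1, 5]], [1, 1, 1], [2, 2]))
```

In this sketch the multipliers act as per-unit storage prices: a center that the relaxed solution overloads becomes more expensive in the next iteration, which pushes items toward centers with spare capacity. The repair pass is what turns the relaxed solution into a placement that respects every capacity limit.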
