Storage tier-aware replicative data reorganization with prioritization for efficient workload processing
Journal: Future Generation Computer Systems

Abstract

The importance of data collection, processing, and analysis is growing rapidly. Big Data technologies are in high demand in many fields, including bioinformatics, hydrometeorology, and high-energy physics. One of the most popular computational paradigms used in large data processing frameworks is the MapReduce programming model. Today, the majority of integrated optimization mechanisms that quickly produce simple solutions consider only load balancing, which is not sufficient for advanced computations. Thus, more efficient and more sophisticated approaches are required. In this paper, we propose an improved category-based algorithm for reorganizing data in MapReduce frameworks using replication as well as network transfer. Moreover, we introduce a customization of the algorithm for urgent computations, which impose specific requirements on execution time and reliability. We also consider modern data storage aspects, such as the ability to work with data on different "layers" (HDD, SSD, and RAM), which can greatly improve the overall performance of our solution.

Highlights

  • A conceptual overview of the basic criteria of the data placement optimization problem is given.
  • The heuristic CRUSH algorithm was adapted to the problem conditions and compared with the authors' own Greedy Algorithm.
  • A Tiering Greedy Algorithm was developed to optimize file location within storage tiers (HDD, SSD, RAM) and was compared with other algorithms.
  • A Categorical Genetic Algorithm (CGA) was developed on the basis of a previously developed Genetic Algorithm for data reorganization.
  • A Tiering Categorical Genetic Algorithm was developed and studied experimentally (including parameter sensitivity analysis).
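The abstract does not describe how the Tiering Greedy Algorithm works, so the following is only a minimal illustrative sketch of what greedy, tier-aware file placement can look like in general. It is not the authors' algorithm. The `Tier` class, the `greedy_tier_placement` function, the "accesses per unit of size" ranking, and all capacities and file statistics are hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """A storage tier with a fixed capacity (units are hypothetical)."""
    name: str
    capacity: int
    used: int = 0
    files: list = field(default_factory=list)


def greedy_tier_placement(files, tiers):
    """Assign each file to the fastest tier that still has room.

    files: list of (name, size, accesses) tuples (assumed statistics).
    tiers: list of Tier objects ordered from fastest to slowest.
    Returns a dict mapping file name to tier name (None if nothing fits).
    """
    placement = {}
    # Assumed heuristic: rank files by access count per unit of size,
    # so small, frequently read files claim the fast tiers first.
    ranked = sorted(files, key=lambda f: f[2] / f[1], reverse=True)
    for name, size, _accesses in ranked:
        for tier in tiers:
            if tier.used + size <= tier.capacity:
                tier.used += size
                tier.files.append(name)
                placement[name] = tier.name
                break
        else:
            placement[name] = None  # no tier had enough free space
    return placement


# Hypothetical example data: three tiers and four files.
tiers = [Tier("RAM", 4), Tier("SSD", 10), Tier("HDD", 100)]
files = [("a", 2, 100), ("b", 4, 40), ("c", 8, 8), ("d", 3, 90)]
placement = greedy_tier_placement(files, tiers)
print(placement)
```

A greedy pass like this is fast and easy to reason about, but it ignores replication, network transfer cost, and task priorities, which the abstract names as concerns of the paper's more elaborate algorithms.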
