Adaptive cache policy scheduling for big data applications on distributed tiered storage system

Abstract

Multitiered storage systems, which are made up of heterogeneous devices, are widely used in distributed environments to accelerate the I/O performance of upper-layer big data applications. This raises new challenges in efficient data migration through smart caching mechanisms among heterogeneous storage levels, such as MEM-SSD-HDD. To optimize the cache policy scheduling mechanism on the distributed tiered storage architecture, we propose a general framework with five layers: a tiered storage system layer, a cache migration policy layer, a cache policy adaptive scheduling layer, a data access pattern layer, and a big data application layer. The framework prototype has been designed and implemented on Alluxio, a widely used distributed hybrid storage system. To meet the demands of the big data application layer, on the one hand, we designed a couple of cache eviction policies and promotion policies covering various access patterns on the cache migration policy layer (several of the proposed eviction policies have been adopted by the Alluxio open-source community). On the other hand, two adaptive cache policy scheduling algorithms for selecting appropriate policies in various scenarios are designed and implemented on the cache policy adaptive scheduling layer. The two scheduling algorithms are based on hit ratio statistics and on data access pattern model prediction, respectively. Experimental results show that the proposed cache policies are very effective for various big data applications, such as Spark SQL. The proposed cache policy scheduling algorithms, which choose among various eviction policies, improve the hit ratio by around 20% compared with using a single eviction policy.
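
The abstract mentions a scheduling algorithm driven by hit ratio statistics but gives no details. The sketch below illustrates only the general idea, under stated assumptions: it replays a recent window of accesses through two textbook eviction policies (LRU and LFU) and picks whichever yields the higher hit ratio. The class names, the `POLICIES` table, and `pick_policy` are hypothetical; this is neither the paper's algorithm nor Alluxio's API.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the least recently used key when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def access(self, key):
        hit = key in self.data
        if hit:
            self.data.move_to_end(key)          # mark as most recently used
        else:
            if len(self.data) >= self.capacity:
                self.data.popitem(last=False)   # drop least recently used
            self.data[key] = True
        return hit


class LFUCache:
    """Evicts the least frequently used key when full (O(n) victim search)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.freq = {}

    def access(self, key):
        hit = key in self.freq
        if hit:
            self.freq[key] += 1
        else:
            if len(self.freq) >= self.capacity:
                victim = min(self.freq, key=self.freq.get)
                del self.freq[victim]
            self.freq[key] = 1
        return hit


# Hypothetical registry of candidate eviction policies.
POLICIES = {"LRU": LRUCache, "LFU": LFUCache}


def hit_ratio(cache, trace):
    """Replay a trace through a cache and return hits / accesses."""
    hits = sum(cache.access(key) for key in trace)
    return hits / len(trace)


def pick_policy(window, capacity):
    """Return (best policy name, per-policy hit ratios) for a recent window."""
    ratios = {name: hit_ratio(cls(capacity), window)
              for name, cls in POLICIES.items()}
    return max(ratios, key=ratios.get), ratios
```

In such a scheme, a hot key interleaved with one-off scans tends to favor LFU, while a shifting working set tends to favor LRU; a real scheduler would also have to weigh the cost of switching policies on a live cache, which this sketch ignores.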

Bibliographic record

  • Source
    Concurrency, practice and experience | 2019, Issue 15 | e5138.1-e5138.25 | 25 pages
  • Author affiliations

    State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing University, Nanjing, China

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    adaptive scheduling; cache framework; eviction policy; promotion policy; tiered storage;
