International Journal of Parallel Programming

A Task-Aware Fine-Grained Storage Selection Mechanism for In-Memory Big Data Computing Frameworks



Abstract

In-memory big data computing, widely used in hot areas such as deep learning and artificial intelligence, can meet the demands of ultra-low-latency services and real-time data analysis. However, existing in-memory computing frameworks usually use memory aggressively: memory space is quickly exhausted, leading to severe performance degradation or even task failure. Meanwhile, the growing volumes of raw and intermediate data impose huge memory demands, which further exacerbate the memory shortage. To relieve memory pressure, these frameworks provide various storage scheme options for caching data, which determine where and how data is cached. However, their storage scheme selection mechanisms are simple and insufficient, and are usually set manually by users. Moreover, such coarse-grained data storage mechanisms cannot satisfy the memory access patterns of individual computing units, each of which works on only part of the data. In this paper, we propose a novel task-aware, fine-grained storage scheme auto-selection mechanism. It automatically determines the storage scheme for caching each data block, the smallest unit during computing. The caching decision is made by considering future tasks, real-time resource utilization, and storage costs, including block creation costs, I/O costs, and serialization costs under each storage scenario. Experiments show that, compared with the default storage setting, our mechanism offers significant performance improvements, by as much as 78% in memory-constrained circumstances.
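
The abstract does not spell out the cost model, so the sketch below is only a minimal illustration of how per-block, cost-based storage scheme selection could work. It assumes Spark-style storage levels; the `BlockInfo` fields, throughput constants, and cost formulas are illustrative assumptions, not the paper's actual model.

```python
from dataclasses import dataclass

# Hypothetical storage schemes, modeled after Spark-style storage levels.
SCHEMES = ["MEMORY_ONLY", "MEMORY_ONLY_SER", "DISK_ONLY"]

@dataclass
class BlockInfo:
    size_bytes: int          # in-memory size of the block
    future_accesses: int     # how many upcoming tasks will read this block
    creation_cost: float     # estimated time (s) to recompute the block

def estimate_cost(block: BlockInfo, scheme: str, free_memory: int,
                  ser_rate: float = 200e6,    # assumed serialization throughput (B/s)
                  disk_rate: float = 100e6):  # assumed disk I/O throughput (B/s)
    """Rough per-scheme cost: caching overhead plus the cost of all future reads."""
    if scheme == "MEMORY_ONLY":
        if block.size_bytes > free_memory:
            # Block does not fit in memory, so it would be recomputed on every access.
            return block.creation_cost * block.future_accesses
        return 0.0                                 # reads from memory are ~free
    if scheme == "MEMORY_ONLY_SER":
        ser = block.size_bytes / ser_rate
        return ser + ser * block.future_accesses   # serialize once, deserialize per read
    if scheme == "DISK_ONLY":
        io = block.size_bytes / disk_rate
        return io + io * block.future_accesses     # write once, read per access
    raise ValueError(scheme)

def select_scheme(block: BlockInfo, free_memory: int) -> str:
    """Pick the scheme with the lowest estimated total cost for this block."""
    return min(SCHEMES, key=lambda s: estimate_cost(block, s, free_memory))

# Example: a 512 MB block reused by three future tasks under tight memory.
blk = BlockInfo(size_bytes=512 * 1024**2, future_accesses=3, creation_cost=8.0)
print(select_scheme(blk, free_memory=256 * 1024**2))
```

Under these assumed numbers the selector falls back to serialized in-memory caching, since the block no longer fits in free memory unserialized but serialized reads are still cheaper than disk I/O or recomputation.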


