International Journal of Parallel Programming

A Task-Aware Fine-Grained Storage Selection Mechanism for In-Memory Big Data Computing Frameworks



Abstract

In-memory big data computing, widely used in hot areas such as deep learning and artificial intelligence, can meet the demands of ultra-low-latency services and real-time data analysis. However, existing in-memory computing frameworks usually use memory aggressively: memory space is quickly exhausted, leading to severe performance degradation or even task failure. Meanwhile, the growing volumes of raw and intermediate data impose huge memory demands, which further exacerbate the memory shortage. To relieve memory pressure, these frameworks provide various storage scheme options for caching data, which determine where and how data is cached. However, their storage scheme selection mechanisms are simple and insufficient, and are usually set manually by users. Moreover, such coarse-grained data storage mechanisms cannot satisfy the memory access patterns of individual computing units, each of which works on only part of the data. In this paper, we propose a novel task-aware, fine-grained storage scheme auto-selection mechanism. It automatically determines the storage scheme for caching each data block, the smallest unit during computing. The caching decision is made by considering future tasks, real-time resource utilization, and storage costs, including block creation costs, I/O costs, and serialization costs under each storage scenario. Experiments show that, compared with the default storage setting, our mechanism offers significant performance improvements, by as much as 78% in memory-constrained circumstances.
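
The abstract does not spell out the cost model, so the sketch below is only a minimal illustration of how per-block, cost-based storage scheme selection could work. It assumes Spark-style storage levels; the `BlockInfo` fields, throughput constants, and cost formulas are illustrative assumptions, not the paper's actual model.

```python
from dataclasses import dataclass

# Hypothetical storage schemes, modeled after Spark-style storage levels.
SCHEMES = ["MEMORY_ONLY", "MEMORY_ONLY_SER", "DISK_ONLY"]

@dataclass
class BlockInfo:
    size_bytes: int          # in-memory size of the block
    future_accesses: int     # how many upcoming tasks will read this block
    creation_cost: float     # estimated time (s) to recompute the block

def estimate_cost(block: BlockInfo, scheme: str, free_memory: int,
                  ser_rate: float = 200e6,    # assumed serialization throughput (B/s)
                  disk_rate: float = 100e6):  # assumed disk I/O throughput (B/s)
    """Rough per-scheme cost: caching overhead plus the cost of all future reads."""
    if scheme == "MEMORY_ONLY":
        if block.size_bytes > free_memory:
            # Block does not fit in memory, so it would be recomputed on every access.
            return block.creation_cost * block.future_accesses
        return 0.0                                 # reads from memory are ~free
    if scheme == "MEMORY_ONLY_SER":
        ser = block.size_bytes / ser_rate
        return ser + ser * block.future_accesses   # serialize once, deserialize per read
    if scheme == "DISK_ONLY":
        io = block.size_bytes / disk_rate
        return io + io * block.future_accesses     # write once, read per access
    raise ValueError(scheme)

def select_scheme(block: BlockInfo, free_memory: int) -> str:
    """Pick the scheme with the lowest estimated total cost for this block."""
    return min(SCHEMES, key=lambda s: estimate_cost(block, s, free_memory))

# Example: a 512 MB block reused by three future tasks under tight memory.
blk = BlockInfo(size_bytes=512 * 1024**2, future_accesses=3, creation_cost=8.0)
print(select_scheme(blk, free_memory=256 * 1024**2))
```

Under these assumed numbers the selector falls back to serialized in-memory caching, since the block no longer fits in free memory unserialized but serialized reads are still cheaper than disk I/O or recomputation.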


