

A Workload-Specific Memory Capacity Configuration Approach for In-Memory Data Analytic Platforms



Abstract

In-memory data analytic platforms such as Spark are now widely adopted in big data processing. Proper memory capacity configuration has been shown to be an effective way to guarantee workload performance on such platforms. Currently, Spark configures the memory capacity for workloads statically, based on user specifications. However, lacking deep knowledge of the target platform and of workload characteristics, non-expert users often configure the memory capacity conservatively and over-provision it, which significantly reduces memory utilization. Moreover, because memory requirements differ widely among workloads, there is no one-size-fits-all solution for memory capacity configuration. To address these issues, we propose WSMC, a workload-specific memory capacity configuration approach for Spark workloads, which guides users in configuring memory capacity by accurately predicting a workload's memory requirement under various input data sizes and parameter settings. First, WSMC classifies in-memory computing workloads into four categories according to their Data Expansion Ratio. Second, WSMC establishes a memory requirement prediction model that takes into account the input data size, the shuffle data size, the parallelism of the workload, and the data block size. For an ad hoc workload, WSMC can profile its Data Expansion Ratio with small-sized input data and decide which category the workload falls into. Users can then determine an accurate configuration from the corresponding memory requirement prediction. Through comprehensive evaluations with SparkBench workloads, we found that, compared with the default configuration, configurations guided by WSMC save over 40% of memory capacity at the cost of a slight performance degradation (only 5%), and that, compared with a proper configuration found manually, configurations guided by WSMC incur only a 7% increase in memory waste while slightly improving workload performance (about 1%).
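The two WSMC steps described in the abstract (profile the Data Expansion Ratio on small inputs and pick a category, then predict memory from the input data size, shuffle data size, parallelism, and block size) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration only: the abstract does not define the Data Expansion Ratio, the category boundaries, or the functional form of the prediction model, so the ratio definition, the thresholds, the category names, the linear model, and all identifiers below are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical category boundaries. The abstract only says there are four
# categories based on the Data Expansion Ratio; it does not give the cut points
# or the category names.
CATEGORY_BOUNDS = [
    (1.0, "shrinking"),
    (2.0, "low-expansion"),
    (5.0, "medium-expansion"),
]
TOP_CATEGORY = "high-expansion"


@dataclass
class ProfileRun:
    """One profiling run of a workload on a small sample input."""
    input_mb: float        # size of the sample input
    peak_memory_mb: float  # peak in-memory data size observed during the run


def data_expansion_ratio(runs):
    """Assumed definition: mean of (in-memory data size / input data size)."""
    return mean(r.peak_memory_mb / r.input_mb for r in runs)


def classify(der):
    """Map a Data Expansion Ratio to one of four (hypothetical) categories."""
    for upper_bound, name in CATEGORY_BOUNDS:
        if der < upper_bound:
            return name
    return TOP_CATEGORY


def predict_memory_mb(coeffs, input_mb, shuffle_mb, parallelism, block_mb):
    """Assumed linear model over the four factors named in the abstract.

    `coeffs` would be fitted per category from profiling data; the paper's
    actual model form is not stated in the abstract.
    """
    return (
        coeffs["intercept"]
        + coeffs["input_mb"] * input_mb
        + coeffs["shuffle_mb"] * shuffle_mb
        + coeffs["parallelism"] * parallelism
        + coeffs["block_mb"] * block_mb
    )


if __name__ == "__main__":
    runs = [ProfileRun(100.0, 250.0), ProfileRun(200.0, 520.0)]
    der = data_expansion_ratio(runs)
    print("category:", classify(der))

    # Made-up coefficients standing in for a fitted per-category model.
    coeffs = {"intercept": 512.0, "input_mb": 2.5, "shuffle_mb": 1.0,
              "parallelism": 16.0, "block_mb": 0.5}
    print("predicted MB:", predict_memory_mb(coeffs, 1000.0, 200.0, 8, 128.0))
```

In this sketch the predicted value would only inform the memory setting a user passes to Spark; how WSMC translates a prediction into a concrete configuration is not described in the abstract.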


Bibliographic details

Source: IEEE International Symposium on Parallel and Distributed Processing with Applications | 2017 | 721p | 5 pages in total
Authors: Yi Liang; Shilu Chang; Chao Su
Original format: PDF
Chinese Library Classification: TP311-53
Keywords: Memory management; Sparks; Predictive models; Task analysis; Data models; Parallel processing; Data analysis


