Understanding and Combating Memory Bloat in Managed Data-Intensive Systems

Khanh Nguyen; Wang Kai; Bu Yingyi; Fang Lu; Xu Guoqing

Abstract

The past decade has witnessed increasing demand for data-driven business intelligence, which has led to the proliferation of data-intensive applications. A managed object-oriented programming language such as Java is often the developer's choice for implementing such applications, due to its quick development cycle and rich suite of libraries and frameworks. While the use of such languages makes programming easier, their automated memory management comes at a cost. When the managed runtime meets large volumes of input data, memory bloat is significantly magnified and becomes a scalability-prohibiting bottleneck.

This article first studies, analytically and empirically, the impact of bloat on the performance and scalability of large-scale, real-world data-intensive systems. To combat bloat, we design a novel compiler framework, called FACADE, that can generate highly efficient data manipulation code by automatically transforming the data path of an existing data-intensive application. The key treatment is that in the generated code, the number of runtime heap objects created for data classes in each thread is (almost) statically bounded, leading to significantly reduced memory management cost and improved scalability. We have implemented FACADE and used it to transform seven common applications on three real-world, already well-optimized data processing frameworks: GraphChi, Hyracks, and GPS. Our experimental results are very positive: the generated programs have (1) achieved a 3% to 48% reduction in execution time and up to an 88x reduction in GC time, (2) consumed up to 50% less memory, and (3) scaled to much larger datasets.
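
The idea of bounding the number of heap objects created for data classes can be illustrated with a minimal sketch. This is not FACADE's generated code or its API: the class PagedPoints, its methods, and the two-double record layout are all hypothetical, chosen only to show records stored in one flat buffer and addressed by an integer index instead of being allocated as individual heap objects.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical illustration only (not FACADE output): point records are
 * packed into a single buffer, so the garbage collector sees one object
 * no matter how many records are stored.
 */
final class PagedPoints {
    // Each record holds two doubles: x followed by y.
    private static final int RECORD_BYTES = 2 * Double.BYTES;

    private final ByteBuffer page; // the only heap object holding record data
    private int count = 0;

    PagedPoints(int capacity) {
        this.page = ByteBuffer.allocate(capacity * RECORD_BYTES);
    }

    /** Appends a record and returns its index, which plays the role of a reference. */
    int add(double x, double y) {
        int ref = count++;
        page.putDouble(ref * RECORD_BYTES, x);
        page.putDouble(ref * RECORD_BYTES + Double.BYTES, y);
        return ref;
    }

    double x(int ref) {
        return page.getDouble(ref * RECORD_BYTES);
    }

    double y(int ref) {
        return page.getDouble(ref * RECORD_BYTES + Double.BYTES);
    }

    int size() {
        return count;
    }

    /** Sums the x field over all records without creating per-record objects. */
    double sumX() {
        double total = 0.0;
        for (int ref = 0; ref < count; ref++) {
            total += x(ref);
        }
        return total;
    }
}
```

In this sketch the number of objects the collector must trace stays constant as records are added, which is the general effect the abstract attributes to the transformed data path. How FACADE itself lays out and accesses data is not described in the abstract.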

Bibliographic details

Source: ACM Transactions on Software Engineering and Methodology, 2017, No. 4, pp. 12.1-12.41 (41 pages)
Authors: Khanh Nguyen; Wang Kai; Bu Yingyi; Fang Lu; Xu Guoqing
Affiliation: Univ Calif Irvine, Dept Comp Sci, Bren Hall, Irvine, CA 92697, USA
Original format: PDF
Language: English
Keywords: Languages; Measurement; Performance; Big data; managed languages; memory management; performance optimization
