ACM Computing Surveys

A Study on Garbage Collection Algorithms for Big Data Environments



Abstract

The need to process and store massive amounts of data (Big Data) is a reality. In areas such as scientific experiments, social network management, credit card fraud detection, targeted advertisement, and financial analysis, massive amounts of information are generated and processed daily to extract valuable, summarized information. Due to their fast development cycle (i.e., lower development cost), owed mainly to automatic memory management, and their rich community resources, managed object-oriented programming languages (e.g., Java) are the first choice for developing Big Data platforms (e.g., Cassandra, Spark) on which such Big Data applications are executed.

However, automatic memory management comes at a cost. This cost is introduced by the garbage collector, which is responsible for collecting objects that are no longer in use. Although current (classic) garbage collection algorithms may be adequate for small-scale applications, they are not appropriate for large-scale Big Data environments, as they do not scale in terms of throughput and pause times.

In this work, current Big Data platforms and their memory profiles are studied to understand why classic algorithms (which are still the most commonly used) are not appropriate, and to analyze recently proposed, relevant memory management algorithms targeted at Big Data environments. The scalability of these recent algorithms is characterized, relative to classic algorithms, in terms of throughput (how much they improve application throughput) and pause time (how much they reduce application latency). The study concludes with a taxonomy of the described works and a set of open problems in Big Data memory management that could be addressed in future work.
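As a minimal illustration (not taken from the surveyed work), the two scalability metrics the abstract refers to can be observed on any standard JVM through the `java.lang.management` API: each collector exposes a cumulative collection count and the total time spent collecting, which approximates the stop-the-world pause cost paid for automatic memory management. The allocation loop below is only a hypothetical workload used to trigger some minor collections.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    public static void main(String[] args) {
        // Hypothetical workload: allocate many short-lived objects so the
        // garbage collector has something to reclaim.
        for (int i = 0; i < 1_000_000; i++) {
            byte[] garbage = new byte[256];
        }

        long totalCollections = 0;
        long totalTimeMs = 0;
        // One MXBean per collector (e.g., young- and old-generation collectors).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both accessors return -1 if the value is unavailable.
            totalCollections += Math.max(0, gc.getCollectionCount());
            totalTimeMs += Math.max(0, gc.getCollectionTime());
        }

        // Cumulative collection time is a rough proxy for the pause-time cost;
        // time not spent here is available to the application (throughput).
        System.out.println("collections=" + totalCollections + " pauseMs=" + totalTimeMs);
    }
}
```

Comparing these counters across collectors (e.g., throughput-oriented Parallel GC versus pause-oriented G1) is one simple way to see the throughput versus pause-time trade-off that the surveyed algorithms try to improve at Big Data scale.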
