Journal: Cybernetics and Information Technologies (CIT)

Performance Optimization System for Hadoop and Spark Frameworks


Abstract

The optimization of large-scale data sets depends on the technologies and methods used. The MapReduce model, implemented on Apache Hadoop or Spark, allows splitting large data sets into a set of blocks distributed over several machines. Data compression reduces data size and transfer time between disks and memory, but requires additional processing. Therefore, finding an optimal tradeoff is a challenge, as a high compression factor may underload Input/Output but overload the processor. The paper presents a system enabling the selection of compression tools and the tuning of the compression factor to reach the best performance in Apache Hadoop and Spark infrastructures, based on simulation analyses.
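The I/O-versus-CPU tradeoff the abstract describes can be illustrated with a small sketch. This is not the paper's optimization system; it uses Python's standard `zlib` codec (standing in for Hadoop-style codecs such as gzip) to show how raising the compression level shrinks the data that must cross the disk/memory boundary while increasing processor time:

```python
import time
import zlib


def compression_tradeoff(data: bytes, levels=(1, 6, 9)):
    """For each zlib level, return (compression ratio, elapsed seconds).

    A lower ratio means less data to transfer (lighter I/O);
    a higher level generally means more CPU work to compress.
    """
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        results[level] = (len(compressed) / len(data), elapsed)
    return results


# A block of repetitive text, loosely standing in for one HDFS block.
sample = b"hadoop spark mapreduce " * 10000
for level, (ratio, secs) in compression_tradeoff(sample).items():
    print(f"level={level}  ratio={ratio:.4f}  time={secs:.4f}s")
```

Measuring ratio and time per level is the kind of observation a tuning system can feed into its choice of codec and compression factor for a given workload.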

