To Overlap or Not to Overlap: Optimizing Incremental MapReduce Computations for On-Demand Data Upload


Abstract

Research on cloud-based Big Data analytics has so far focused on optimizing the performance and cost-effectiveness of the computations, while largely neglecting an important aspect: users need to upload massive datasets to the cloud before their computations can run. This paper studies the problem of running MapReduce applications while simultaneously optimizing the performance and cost of the data upload and its corresponding computation, taken together. We analyze the feasibility of incremental MapReduce approaches that advance the computation as much as possible during the data upload by using already transferred data to calculate intermediate results. Our key finding is that overlapping the transfer time with as many incremental computations as possible is not always efficient: a better solution is to wait for enough data to fill the computational capacity of the MapReduce cluster. Results show significant performance and cost reductions compared with state-of-the-art solutions that leverage incremental computations in a naive fashion.
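The trade-off described in the abstract can be illustrated with a toy model. The sketch below is not the paper's method or its evaluation setup; it is a minimal illustration under assumed simplifications: chunks arrive at a constant rate, incremental jobs run one at a time, each job pays a fixed startup overhead, and the cluster processes `slots` chunks per wave. The function name `finish_time` and all parameter values are hypothetical, and the final merge/reduce of intermediate results is ignored.

```python
import math


def finish_time(num_chunks, upload_s_per_chunk, slots, job_overhead_s, task_s, batch):
    """Toy model (an assumption, not taken from the paper).

    One chunk finishes uploading every `upload_s_per_chunk` seconds. An
    incremental job is launched for every `batch` uploaded chunks (the last
    job may be smaller). Jobs run sequentially; each costs a fixed overhead
    plus one `task_s` wave per `slots` chunks.

    Returns (time the last job ends, total seconds the cluster was busy).
    """
    cluster_free = 0.0  # time at which the cluster can accept the next job
    busy = 0.0          # accumulated cluster busy time, a rough cost proxy
    done = 0            # chunks already assigned to a job
    while done < num_chunks:
        size = min(batch, num_chunks - done)
        ready = (done + size) * upload_s_per_chunk  # last chunk of the batch arrives
        start = max(ready, cluster_free)
        duration = job_overhead_s + math.ceil(size / slots) * task_s
        cluster_free = start + duration
        busy += duration
        done += size
    return cluster_free, busy


if __name__ == "__main__":
    # Hypothetical parameters: 64 chunks, 2 s upload per chunk, 8 map slots,
    # 10 s job startup overhead, 5 s per wave of map tasks.
    params = dict(num_chunks=64, upload_s_per_chunk=2, slots=8,
                  job_overhead_s=10, task_s=5)
    for label, batch in [("one job per chunk", 1),
                         ("wait to fill all slots", 8),
                         ("no overlap", 64)]:
        end, busy = finish_time(batch=batch, **params)
        print(f"{label:24s} finish={end:6.0f}s  cluster busy={busy:5.0f}s")
```

With these assumed numbers, launching a job per chunk leaves most slots idle and pays the startup overhead 64 times, so it finishes later and keeps the cluster busy longer than waiting for 8 chunks at a time. The actual break-even point depends on the real overheads and bandwidth, which this sketch does not attempt to model.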

Bibliographic details

Source
DataCloud 2014: 5th International Workshop on Data Intensive Computing in the Clouds, held in conjunction with SC14: The International Conference for High Performance Computing, Networking, Storage and Analysis | 2014 | pp. 9-16 | 8 pages
Conference location: New Orleans, LA (US)
Authors
Ene Stefan; Nicolae Bogdan; Costan Alexandru; Antoniu Gabriel;
Author affiliations

Univ. Politeh. Bucharest, Bucharest, Romania;

Original format: PDF
Language: English
Keywords
Big Data; data analysis; parallel processing; MapReduce applications; cloud-based big data analytics; computational capacity; incremental MapReduce computation optimization; on-demand data upload; performance optimization; transfer time; Algorithm design and analysis; Cloud computing; Computational modeling; Context; Data models; Data transfer; Throughput; MapReduce; data management; incremental processing;

Similar literature

1. MRO-MPI: MapReduce overlapping using MPI and an optimized data exchange policy [J]. Hisham Mohamed, Stephane Marchand-Maillet, Parallel Computing. 2013, No. 12
2. An Improved Algorithm for Optimizing MapReduce Based on Locality and Overlapping [J]. Jianjiang Li, Jie Wang, Bin Lyu, Journal of Tsinghua University (English Edition). 2018, No. 6
3. Joint optimization of overlapping phases in MapReduce [J]. Minghong Lin, Li Zhang, Adam Wierman, Performance Evaluation. 2013, No. 10
4. To Overlap or Not to Overlap: Optimizing Incremental MapReduce Computations for On-Demand Data Upload [C]. Ene Stefan, Nicolae Bogdan, Costan Alexandru, International Workshop on Data Intensive Computing in the Clouds; International Conference for High Performance Computing, Networking, Storage and Analysis. 2014
5. Bayesian inference with overlapping data: Methodology and application to system reliability estimation and sensor placement optimization [D]. Jackson, Christopher Stephen. 2011
6. Computational solution of spike overlapping using data-based subtraction algorithms to resolve synchronous sympathetic nerve discharge [O]. Chun-Kuei Su, Chia-Hsun Chiang, Chia-Ming Lee, 2013
7. To Overlap or Not to Overlap: Optimizing Incremental MapReduce Computations for On-Demand Data Upload [O]. Ene, Stefan, Nicolae, Bogdan, Costan, Alexandru, 2014
