Optimizing big data processing performance in the public cloud: opportunities and approaches

Wang Dan; Liu Jiangchuan

Abstract

Today's lightning-fast data generation from massive sources calls for efficient big data processing, which imposes unprecedented demands on computing and networking infrastructures. State-of-the-art tools, most notably MapReduce, generally run on dedicated server clusters to exploit data parallelism. For grassroots users or non-computing professionals, the cost of deploying and maintaining a large-scale dedicated server cluster can be prohibitively high, not to mention the technical skills involved. Public clouds, on the other hand, allow general users to rent virtual machines (VMs) and run their applications in a pay-as-you-go manner, with ultra-high scalability and minimal upfront costs. This new computing paradigm has gained tremendous success in recent years, becoming a highly attractive alternative to dedicated server clusters. This article discusses the critical challenges and opportunities when big data meets the public cloud. We identify the key differences between running big data processing in a public cloud and in dedicated server clusters. We then present two important problems for efficient big data processing in the public cloud: resource provisioning (i.e., how to rent VMs) and VM-MapReduce job/task scheduling (i.e., how to run MapReduce after the VMs are constructed). Each of these two problems comprises a set of subproblems. We present solution approaches for some of them and offer optimized design guidelines for others. Finally, we discuss our implementation experiences.

Bibliographic record

Source
IEEE Network | 2015, issue 5 | pp. 31-35 | 5 pages
Authors
Wang Dan; Liu Jiangchuan
Affiliation
Department of Computing, Hong Kong Polytechnic University
Indexing information
Original format: PDF
Language: English
