Big Data with Cloud Computing: an insight on the computing environment, MapReduce, and programming frameworks

Abstract

The term 'Big Data' has spread rapidly in the fields of Data Mining and Business Intelligence. This new scenario can be defined by those problems that cannot be addressed effectively or efficiently with the standard computing resources currently available. We must emphasize that Big Data implies not only large volumes of data but also the need for scalability, i.e., ensuring a response within an acceptable elapsed time. When scalability is considered, traditional parallel solutions are usually contemplated, such as the Message Passing Interface or high-performance and distributed Database Management Systems. Nowadays a new paradigm has gained popularity over these solutions because of the benefits it offers. This model is Cloud Computing, and among its main features we must stress its elasticity in the use of computing resources and space, reduced management effort, and flexible costs. In this article, we provide an overview of Big Data and of how the problem can be addressed from the perspective of Cloud Computing and its programming frameworks. In particular, we focus on systems for large-scale analytics based on the MapReduce scheme and on Hadoop, its open-source implementation. We identify several libraries and software projects that have been developed to help practitioners work with this new programming model. We also analyze the advantages and disadvantages of MapReduce in contrast to the classical solutions in this field. Finally, we present a number of programming frameworks that have been proposed as alternatives to MapReduce, developed to overcome the shortcomings of this model in certain scenarios and platforms.

WIREs Data Mining Knowl Discov 2014, 4:380-409. doi: 10.1002/widm.1134

For further resources related to this article, please visit the .

Conflict of interest: The authors have declared no conflicts of interest for this article.
