Venue: International Conference on Management of Data

Oracle In-Database Hadoop: When MapReduce Meets RDBMS

Abstract

Big data is the tar sands of the data world: vast reserves of raw, gritty data whose valuable information content can only be extracted at great cost. MapReduce is a popular parallel programming paradigm well suited to the programmatic extraction and analysis of information from these unstructured Big Data reserves. The Apache Hadoop implementation of MapReduce has become an important player in this market due to its ability to exploit large networks of inexpensive servers. The increasing importance of unstructured data has driven interest in MapReduce and its Apache Hadoop implementation, which in turn has led data processing vendors to support this programming style. Oracle RDBMS has supported the MapReduce paradigm for many years through user-defined pipelined table functions and aggregation objects. However, that support has not been Hadoop source compatible: native Hadoop programs had to be rewritten before they could be used in this framework. The ability to run Hadoop programs inside the Oracle database provides a versatile solution to database users, allowing them to use programming skills they may already possess and to exploit the growing Hadoop ecosystem. In this paper, we describe a prototype of Oracle In-Database Hadoop that supports the running of native Hadoop applications written in Java. This implementation executes Hadoop applications using the efficient parallel capabilities of the Oracle database and a subset of the Apache Hadoop infrastructure. The system's target audience includes both SQL and Hadoop users. We discuss the architecture and design and, in particular, demonstrate how MapReduce functionality is seamlessly integrated within SQL queries. We also share our experience in building such a system within the Oracle database and discuss follow-on topics that we think are promising areas for exploration.