Published in: International Conference on Management of Data

Efficient Processing of Data Warehousing Queries in a Split Execution Environment


Abstract

Hadapt is a start-up company currently commercializing the Yale University research project called HadoopDB. The company focuses on building a platform for Big Data analytics in the cloud by introducing a storage layer optimized for structured data and by providing a framework for executing SQL queries efficiently. This work considers processing data warehousing queries over very large datasets. Our goal is to maximize performance without giving up fault tolerance or scalability. We analyze the complexity of this problem in the split execution environment of HadoopDB. Here, incoming queries are examined; parts of the query are pushed down and executed inside the higher-performing database layer; and the rest of the query is processed in a more generic MapReduce framework. In this paper, we discuss in detail performance-oriented query execution strategies for data warehouse queries in split execution environments, with particular focus on join and aggregation operations. The efficiency of our techniques is demonstrated by running experiments using the TPC-H benchmark with 3 TB of data. In these experiments we compare our results with a standard commercial parallel database and an open-source MapReduce implementation featuring a SQL interface (Hive). We show that HadoopDB successfully competes with other systems.
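The split execution idea described in the abstract can be illustrated with a minimal sketch. This is not HadoopDB's code or API: SQLite in-memory databases stand in for each node's local database layer, plain Python functions stand in for the MapReduce framework, and the table name `lineitem`, its columns, and all function names are hypothetical choices made for this example. The sketch shows an aggregation being pushed down as SQL to each node, with only the small partial results merged in the generic layer.

```python
import sqlite3
from collections import defaultdict

# Assumption: each "node" is an in-memory SQLite database standing in for
# the per-node database layer. Table and column names are illustrative.
def make_node(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lineitem (returnflag TEXT, quantity INTEGER)")
    conn.executemany("INSERT INTO lineitem VALUES (?, ?)", rows)
    return conn

# The part of the query pushed down into the database layer:
# a partial aggregation computed locally on every node.
PUSHED_DOWN_SQL = """
    SELECT returnflag, SUM(quantity), COUNT(*)
    FROM lineitem
    GROUP BY returnflag
"""

def map_phase(node):
    """Run the pushed-down SQL on one node and return its partial aggregates."""
    return node.execute(PUSHED_DOWN_SQL).fetchall()

def reduce_phase(partials):
    """Merge partial (sum, count) pairs per group and derive the average."""
    totals = defaultdict(lambda: [0, 0])
    for flag, qty_sum, row_count in partials:
        totals[flag][0] += qty_sum
        totals[flag][1] += row_count
    return {flag: (s, c, s / c) for flag, (s, c) in totals.items()}

def split_execute(nodes):
    """Collect partial results from every node, then merge them."""
    partials = [row for node in nodes for row in map_phase(node)]
    return reduce_phase(partials)

nodes = [
    make_node([("A", 10), ("R", 5), ("A", 20)]),
    make_node([("A", 30), ("N", 7)]),
]
result = split_execute(nodes)
print(result["A"])  # (sum, count, average) for group "A"
```

Carrying SUM and COUNT separately through the merge step, instead of a per-node average, is what keeps the final average correct when groups are spread unevenly across nodes.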
