
Optimization of Progressive Queries via Materialized Views for Large Databases



Abstract

There is an increasing demand to efficiently process emerging types of queries, such as progressive queries (PQ), on large-scale databases from numerous contemporary applications including telematics, e-commerce, and social media. Unlike a conventional query, a PQ consists of a set of interrelated step-queries (SQ). A user formulates a new SQ on the fly based on the result(s) from the previously executed SQ(s). Processing PQs raises a number of new challenges. Existing database management systems were not designed to efficiently process such queries. In this dissertation, we propose a suite of novel materialized-view-based techniques to efficiently process PQs. First, we propose a dynamic materialized-view-based approach to efficiently processing a special type of PQs, called monotonic linear PQs. We introduce a so-called superior relationship graph to capture superior relationships among SQs of such a PQ and suggest a method to estimate the benefit of keeping the result of an SQ as a materialized view using the graph. To efficiently construct the superior relationship graph, we propose two algorithms: generating-based and pruning-based. To improve the view searching efficiency and quality, we design an algorithm with a special storage structure to store and manage the materialized views. Second, to handle generic PQs, we define a so-called multiple query dependency graph to capture the data source dependency relationships that exist among SQs and external tables of a generic PQ. Using the graph, a mathematical benefit estimation model, which takes both the impact and the effectiveness of materialization into consideration, is derived. A greedy method and a dynamic programming method to solve the view maintenance problem are proposed. Third, to efficiently find usable materialized views from the view space/set for answering a given SQ, we suggest a dynamic materialized view index method. A special index tree structure with nodes ordered by a two-level priority rule that facilitates efficient locating of different types of nodes is designed. Bitmaps encoded with special methods are also used to refine the pruning of unusable views during a search. Fourth, to support PQs in a big data environment like Hadoop, we propose an index-based technique for performing a new column family join operation on HBase tables. To efficiently process such a join operation, we suggest a multiple freedom family index. A parallel MapReduce algorithm to construct the index is developed. To perform a column family join on two HBase tables using the indexes, we present two partitioning methods to balance the workload among map nodes in a MapReduce algorithm. The introduced column family join operation and its relevant processing technique can ensure the closure property that is essential to the processing of PQs. To examine the performance of the proposed techniques, we performed extensive empirical and theoretical analyses. Our studies show that the proposed techniques are quite promising in efficiently processing PQs. To our knowledge, our work is the first to apply the materialized-view-based approach to efficiently processing progressive queries on large databases.
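
As a rough illustration of the general idea of choosing which step-query results to keep as materialized views, the sketch below applies a generic greedy benefit-per-size heuristic under a storage budget. It is not the dissertation's algorithm or benefit model: the `StepResult` type, the `greedy_select` function, the budget, and all sizes and benefit values are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepResult:
    """Result of one step-query (SQ) that could be kept as a materialized view."""
    name: str       # hypothetical SQ identifier, e.g. "SQ1"
    size: int       # assumed storage cost of keeping the result (must be > 0)
    benefit: float  # assumed saving if later SQs reuse this result


def greedy_select(candidates, budget):
    """Keep results in descending benefit-per-size order while they fit the budget.

    This is a generic knapsack-style heuristic, used here only as a stand-in
    for whatever selection rule a real system would apply.
    """
    chosen = []
    remaining = budget
    ranked = sorted(candidates, key=lambda c: c.benefit / c.size, reverse=True)
    for c in ranked:
        if c.size <= remaining:
            chosen.append(c.name)
            remaining -= c.size
    return chosen


# Made-up candidates: three step-query results with assumed sizes and benefits.
candidates = [
    StepResult("SQ1", size=40, benefit=60.0),
    StepResult("SQ2", size=10, benefit=30.0),
    StepResult("SQ3", size=25, benefit=25.0),
]

selected = greedy_select(candidates, budget=50)
print(selected)
```

With these made-up numbers, the small but frequently reusable result is picked first and the remaining budget goes to the next best ratio that still fits. A greedy rule like this is cheap to evaluate each time a new SQ arrives, but it can miss the best combination, which is one reason an exact method such as dynamic programming is sometimes considered alongside it.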

Bibliographic details

  • Author

    Zhu Chao

  • Author affiliation
  • Year 2014
  • Total pages
  • Original format PDF
  • Text language en_US
  • Chinese Library Classification
