首页> 外文期刊>IEEE Transactions on Knowledge and Data Engineering >Demythization of Structural XML Query Processing: Comparison of Holistic and Binary Approaches
【24h】

Demythization of Structural XML Query Processing: Comparison of Holistic and Binary Approaches

机译:结构XML查询处理的解除化:整体和二进制方法的比较

获取原文
获取原文并翻译 | 示例

摘要

XML queries can be modeled by twig pattern queries (TPQs) specifying predicates on XML nodes and XPath relationships satisfied between them. A lot of TPQ types have been proposed; this paper takes into account a TPQ model extended by a specification of output and non-output query nodes since it complies with the XQuery semantics and, in many cases, it leads to a more efficient query processing. In general, there are two types of approaches to process a TPQ: holistic joins and binary joins. Whereas the binary join approach builds a query plan as a tree of interconnected binary operators, the holistic join approach evaluates a whole query using one operator (i.e., using one complex algorithm). Surprisingly, a thorough analytical and experimental comparison is still missing despite an enormous research effort in this area. In this paper, we try to fill this gap; we analytically and experimentally show that the binary joins used in a fully-pipelined plan (i.e., the plan where each join operation does not wait for the complete result of the previous operation and no explicit sorting is used) can often outperform the holistic joins, especially for TPQs with a higher ratio of non-output query nodes. The main contributions of this paper can be summarized as follows: (i) we introduce several improvements of existing binary join approaches allowing to build a fully-pipelined plan for a TPQ considering non-output query nodes, (ii) we prove that for a certain class of TPQs such a plan has the linear time complexity with respect to the size of the input and output as well as the linear space complexity with respect to the XML document depth (i.e., the same complexity as the holistic join approaches), (iii) we show that our improved binary join approach outperforms the holistic join approaches in many situations, and (iv) we propose a simple combined approach that utilizes advantages of both types of approaches.
机译:XML查询可以由枝花模式查询(TPQ)建模,指定XML节点的谓词和它们之间满足XPATH关系。已经提出了许多TPQ类型;本文考虑了由输出和非输出查询节点的规范扩展的TPQ模型,因为它符合XQuery语义,并且在许多情况下,它导致更有效的查询处理。通常,处理TPQ的两种方法:整体连接和二进制加入。虽然二进制连接方法将查询计划作为互连二进制运算符的树构建,但整体连接方法使用一个操作员(即,使用一种复杂算法)来评估整个查询。令人惊讶的是,尽管该领域有巨大的研究努力,但仍然缺少彻底的分析和实验比较。在本文中,我们试图填补这个差距;我们分析和实验表明,在完全流水线计划中使用的二进制联接(即,每个连接操作不等待先前操作的完整结果,并且没有使用显式排序的计划)通常优于整体连接,特别是对于具有较高比例的非输出查询节点的TPQ。本文的主要贡献可以概括如下:(i)我们介绍了现有二进制加入方法的几种改进,允许考虑非输出查询节点的TPQ,我们证明了一个完全流水线计划某些类别的TPQS这样的计划具有关于输入和输出大小的线性时间复杂度以及相对于XML文档深度的线性空间复杂度(即,与整体连接方法相同的复杂性),( III)我们表明我们改进的二进制加入方法优于许多情况下整体连接方法,(iv)我们提出了一种利用两种类型方法的优势的简单组合方法。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号