首页> 外文会议>SIGMOD/PODS 2007 >Efficient Algorithms for Evaluating Xpath over Streams*
【24h】

Efficient Algorithms for Evaluating Xpath over Streams*

机译:评估流上的Xpath的高效算法*

获取原文

摘要

In this paper we address the problem of evaluating Xpath queries over streaming XML data. We consider a practical Xpath fragment called Univariate Xpath, which includes the commonly used '/' and '//' axes and allows ?-node tests and arbitrarily nested predicates. It is well known that this Xpath fragment can be efficiently evaluated in O(|D||Q|) time in the non-streaming environment [18], where |D| is the document size and |Q| is the query size. However, this is not necessarily true in the streaming environment, since streaming algorithms have to satisfy stricter requirement than non-streaming algorithms, in that all data must be read sequentially in one pass. Therefore, it is not surprising that state-of-the-art stream-querying algorithms have higher time complexity than O(|D||Q|). In this paper we revisit the Xpath stream-querying prob- lem, and show that Univariate Xpath can be efficiently eval- uated in O(|D||Q|) time in the streaming environment. Spe- cifically, we propose two O(|D||Q|)-time stream-querying al- gorithms, LQ and EQ, which are based on the lazy strategy and on the eager strategy, respectively. To the best of our knowledge, LQ and EQ are the first Xpath stream-querying algorithms that achieve O(|D||Q|) time performance. Fur- ther, our algorithms achieve O(|D||Q|) time performance without trading off space performance. Instead, they have better buffering-space performance than state-of-the-art str- eam-querying algorithms. In particular, EQ achieves opti- mal buffering-space performance. Our experimental results show that our algorithms have not only good theoretical complexity but also considerable practical performance ad- vantages over existing algorithms.
机译:在本文中,我们解决了在流XML数据上评估Xpath查询的问题。我们考虑一个称为Univariate Xpath的实用Xpath片段,它包含常用的'/'和'//'轴,并允许进行β节点测试和任意嵌套的谓词。众所周知,在非流环境中,| X |可以在O(| D || Q |)时间内有效地评估此Xpath片段[18]。是文档大小和| Q |是查询大小。但是,在流环境中,这不一定是正确的,因为与非流算法相比,流算法必须满足更严格的要求,因为必须逐次读取所有数据。因此,最新的流查询算法具有比O(| D || Q |)高的时间复杂度也就不足为奇了。在本文中,我们回顾了Xpath流查询问题,并表明在流环境中,可以在O(| D || Q |)时间有效评估单变量Xpath。具体来说,我们提出了两种O(| D || Q |)时间流查询算法LQ和EQ,它们分别基于惰性策略和渴望策略。据我们所知,LQ和EQ是最早实现O(| D || Q |)时间性能的Xpath流查询算法。此外,我们的算法在不影响空间性能的情况下实现了O(| D || Q |)时间性能。取而代之的是,它们比最先进的str-query查询算法具有更好的缓冲空间性能。特别是,均衡器实现了最佳的缓冲空间性能。我们的实验结果表明,与现有算法相比,我们的算法不仅具有良好的理论复杂性,而且具有相当大的实际性能优势。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号