首页> 外文期刊>ACM transactions on database systems >Path Sharing and Predicate Evaluation for High-Performance XML Filtering
【24h】

Path Sharing and Predicate Evaluation for High-Performance XML Filtering

机译:高性能XML过滤的路径共享和谓词评估

获取原文
获取原文并翻译 | 示例

摘要

XML filtering systems aim to provide fast, on-the-fly matching of XML-encoded data to large numbers of query specifications containing constraints on both structure and content. It is now well accepted that approaches using event-based parsing and Finite State Machines (FSMs) can provide the basis for highly scalable structure-oriented XML filtering systems. The XFilter system [Altinel and Franklin 2000] was the first published FSM-based XML filtering approach. XFilter used a separate FSM per path query and a novel indexing mechanism to allow all of the FSMs to be executed simultaneously during the processing of a document. Building on the insights of the XFilter work, we describe a new method, called "YFilter" that combines all of the path queries into a single Nondeterministic Finite Automaton (NFA). YFilter exploits commonality among queries by merging common prefixes of the query paths such that they are processed at most once. The resulting shared processing provides tremendous improvements in structure matching performance but complicates the handling of value-based predicates. In this article, we first describe the XFilter and YFilter approaches and present results of a detailed performance comparison of structure matching for these algorithms as well as a hybrid approach. The results show that the path sharing employed by YFilter can provide order-of-magnitude performance benefits. We then propose two alternative techniques for extending YFilter's shared structure matching with support for value-based predicates, and compare the performance of these two techniques. The results of this latter study demonstrate some key differences between shared XML filtering and traditional database query processing. Finally, we describe how the YFilter approach is extended to handle more complicated queries containing nested path expressions.
机译:XML过滤系统旨在提供XML编码数据与包含结构和内容约束的大量查询规范的快速,即时匹配。现在已被广泛接受,使用基于事件的解析和有限状态机(FSM)的方法可以为高度可扩展的面向结构的XML过滤系统提供基础。 XFilter系统[Altinel和Franklin 2000]是第一个已发布的基于FSM的XML过滤方法。 XFilter对每个路径查询使用了单独的FSM,并使用了新颖的索引机制来允许所有FSM在文档处理期间同时执行。基于XFilter工作的见识,我们描述了一种称为“ YFilter”的新方法,该方法将所有路径查询组合到一个非确定性有限自动机(NFA)中。 YFilter通过合并查询路径的公共前缀来利用查询之间的公共性,这样最多可以处理一次。最终的共享处理在结构匹配性能上提供了极大的改进,但是使基于值的谓词的处理变得复杂。在本文中,我们首先描述XFilter和YFilter方法,并给出这些算法的结构匹配以及混合方法的详细性能比较结果。结果表明,YFilter所采用的路径共享可以提供数量级的性能优势。然后,我们提出了两种替代技术来扩展YFilter的共享结构匹配,并支持基于值的谓词,并比较这两种技术的性能。后一项研究的结果证明了共享XML过滤与传统数据库查询处理之间的一些关键区别。最后,我们描述如何扩展YFilter方法以处理包含嵌套路径表达式的更复杂的查询。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号