Path Sharing and Predicate Evaluation for High-Performance XML Filtering

YANLEI DIAO; MEHMET ALTINEL; MICHAEL J. FRANKLIN; HAO ZHANG; PETER FISCHER

首页> 外文期刊>ACM transactions on database systems >Path Sharing and Predicate Evaluation for High-Performance XML Filtering

【24h】

Path Sharing and Predicate Evaluation for High-Performance XML Filtering

机译：高性能XML过滤的路径共享和谓词评估

获取原文

获取原文并翻译 | 示例

开具论文收录证明 >>

页面导航

摘要
著录项
引文网络
相似文献
相关主题

摘要

XML filtering systems aim to provide fast, on-the-fly matching of XML-encoded data to large numbers of query specifications containing constraints on both structure and content. It is now well accepted that approaches using event-based parsing and Finite State Machines (FSMs) can provide the basis for highly scalable structure-oriented XML filtering systems. The XFilter system [Altinel and Franklin 2000] was the first published FSM-based XML filtering approach. XFilter used a separate FSM per path query and a novel indexing mechanism to allow all of the FSMs to be executed simultaneously during the processing of a document. Building on the insights of the XFilter work, we describe a new method, called "YFilter" that combines all of the path queries into a single Nondeterministic Finite Automaton (NFA). YFilter exploits commonality among queries by merging common prefixes of the query paths such that they are processed at most once. The resulting shared processing provides tremendous improvements in structure matching performance but complicates the handling of value-based predicates. In this article, we first describe the XFilter and YFilter approaches and present results of a detailed performance comparison of structure matching for these algorithms as well as a hybrid approach. The results show that the path sharing employed by YFilter can provide order-of-magnitude performance benefits. We then propose two alternative techniques for extending YFilter's shared structure matching with support for value-based predicates, and compare the performance of these two techniques. The results of this latter study demonstrate some key differences between shared XML filtering and traditional database query processing. Finally, we describe how the YFilter approach is extended to handle more complicated queries containing nested path expressions.

机译：XML过滤系统旨在提供XML编码数据与包含结构和内容约束的大量查询规范的快速，即时匹配。现在已被广泛接受，使用基于事件的解析和有限状态机（FSM）的方法可以为高度可扩展的面向结构的XML过滤系统提供基础。 XFilter系统[Altinel和Franklin 2000]是第一个已发布的基于FSM的XML过滤方法。 XFilter对每个路径查询使用了单独的FSM，并使用了新颖的索引机制来允许所有FSM在文档处理期间同时执行。基于XFilter工作的见识，我们描述了一种称为“ YFilter”的新方法，该方法将所有路径查询组合到一个非确定性有限自动机（NFA）中。 YFilter通过合并查询路径的公共前缀来利用查询之间的公共性，这样最多可以处理一次。最终的共享处理在结构匹配性能上提供了极大的改进，但是使基于值的谓词的处理变得复杂。在本文中，我们首先描述XFilter和YFilter方法，并给出这些算法的结构匹配以及混合方法的详细性能比较结果。结果表明，YFilter所采用的路径共享可以提供数量级的性能优势。然后，我们提出了两种替代技术来扩展YFilter的共享结构匹配，并支持基于值的谓词，并比较这两种技术的性能。后一项研究的结果证明了共享XML过滤与传统数据库查询处理之间的一些关键区别。最后，我们描述如何扩展YFilter方法以处理包含嵌套路径表达式的更复杂的查询。

著录项

来源
《ACM transactions on database systems》 |2003年第4期|p.467-516|共50页
作者
YANLEI DIAO; MEHMET ALTINEL; MICHAEL J. FRANKLIN; HAO ZHANG; PETER FISCHER;
展开▼
作者单位

Department of EECS University of California, Berkeley, 421 Soda Hall, Berkeley, CA 94720;

展开▼
收录信息美国《科学引文索引》(SCI);美国《工程索引》(EI);
原文格式 PDF
正文语种 eng
中图分类计算机软件;
关键词
XML filtering; structure matching; path sharing; nondeter-ministic finite automaton; content-based matching; predicate evaluation; nested path expressions;

机译：XML过滤;结构匹配;路径共享;非确定性有限自动机;基于内容的匹配;谓词评估;嵌套路径表达式;

相似文献

外文文献
中文文献
专利

1. Value-based Predicate Filtering Of Xml Documents [J] . Joonho Kwon, Praveen Rao, Bongki Moon, Data & Knowledge Engineering . 2008,第1期

机译：基于价值的XML文档谓词过滤
2. Delay-shared N-path structures for video-rate SC FIR filters [J] . Cortelazzo G.M., Malavasi E. IEEE Transactions on Circuits and Systems for Video Technology . 1995,第2期

机译：视频速率SC FIR滤波器的延迟共享N路径结构
3. Delay-shared N-path structures for video-rate SC FIR filters [J] . Cortelazzo G.M., Malavasi E., Segato M., IEEE Transactions on Circuits and Systems for Video Technology . 1995,第2期

机译：视频速率SC FIR滤波器的延迟共享N路径结构
4. PathStack~-: A Holistic Path Join Algorithm for Path Query with Not-Predicates on XML Data [C] . Enhua Jiao, Tok Wang Ling, Chee-Yong Chan International Conference on Database Systems for Advanced Applications . 2005

机译：pathstack〜 - ：用于XML数据上没有谓词的路径查询的整体路径连接算法
5. An Evaluation of the Cumulative Impact of Multiple Priorities and Strong Organizational Culture on Shared Vision and Cross-Functional Alignment in a High-Performance Organization [D] . Campbell, Edgar E. 2011

机译：评估高性能组织中的多个优先事项和强大的组织文化对共同愿景和跨职能联盟的累积影响
6. Argument-predicate distance as a filter for enhancing precision in extracting predications on the genetic etiology of disease [O] . Marco Masseroli, Halil Kilicoglu, François-Michel Lang, 2006

机译：参数-谓词距离作为过滤器可提高提取疾病遗传病因的预测的精度
7. Path Sharing and Predicate Evaluation for High-Performance XML Filtering [O] . Yanlei Diao, Michael J. Franklin, Hao Zhang, 2014

机译：高性能XmL过滤的路径共享和谓词评估

Path Sharing and Predicate Evaluation for High-Performance XML Filtering

摘要

著录项

引文网络

相似文献

相关主题

期刊订阅