
High performance XPath evaluation in XML streams.

Abstract

This thesis presents methods for efficiently evaluating structural queries over tree-structured data streams. A data stream usually consists of a sequence of items that arrive in an order determined by the source. An application that uses such data cannot revisit an earlier item in the stream unless it buffers the item itself. Naive buffering methods are not practical due to the high throughput and indefinite length of data streams. Compared with the flat, relational-like data model for data streams that has received recent attention, processing a tree-structured XML data stream poses additional challenges, since a data item cannot, in general, be interpreted without taking structural information into account.

In this thesis, we focus on the evaluation of XPath queries on streaming XML. As a W3C standard, XPath has become a core XML technology not only as a standalone query language but also as the foundation of XQuery and XSLT. Features such as subqueries and reverse axes make XPath a powerful query language but they also complicate XPath query processing. We present our work on XSQ, a streaming XPath query engine. Our methods are based on a novel segment-based evaluation scheme. XSQ uses very little memory and is able to process unbounded and unsegmented streaming data because it does not build a DOM tree in memory. It also provides high throughput by only processing the relevant portions of the data and low response time by returning results as early as possible. XSQ is the first streaming system to support complex XPath features such as multiple predicates, closure axes, aggregations, reverse axes, and subqueries.

We also describe our work on XPaSS, an XPath-based publish-subscribe system that simultaneously evaluates a large number of XPath queries over XML streams. Unlike other similar systems that filter pre-segmented documents as results, XPaSS returns only the precisely delineated data specified by a user query. It uses a segment-sharing scheme instead of prefix- and suffix-sharing that are commonly used. In our experiments, XPaSS supports up to one million XPath subscriptions using a modest PC-class server, with a throughput comparable to that of the simpler filtering systems.
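
To make the idea of evaluating a path query without building a DOM tree concrete, the following is a minimal sketch in Python using the standard `xml.sax` event parser. It is not the segment-based scheme of XSQ, which the abstract does not detail. It handles only simple child-axis paths such as `/pub/book/title`, with no predicates, closure axes, or reverse axes. The names `PathMatcher` and `stream_select` are hypothetical and chosen for illustration only.

```python
import io
import xml.sax


class PathMatcher(xml.sax.ContentHandler):
    """Emit the text of every element whose root-to-node path equals `steps`.

    Illustrative only: supports child-axis steps, nothing else.
    """

    def __init__(self, steps, emit):
        super().__init__()
        self.steps = steps    # e.g. ["pub", "book", "title"]
        self.emit = emit      # callback invoked once per matching element
        self.stack = []       # names of the currently open elements
        self.buffer = None    # text of the open matching element, if any

    def startElement(self, name, attrs):
        self.stack.append(name)
        if self.stack == self.steps:
            self.buffer = []  # start buffering only inside a match

    def characters(self, content):
        if self.buffer is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if self.buffer is not None and self.stack == self.steps:
            # The match is complete: report it immediately, then drop it.
            self.emit("".join(self.buffer))
            self.buffer = None
        self.stack.pop()


def stream_select(xml_bytes, path):
    """Run the matcher over a byte stream and collect the emitted results."""
    results = []
    handler = PathMatcher(path.strip("/").split("/"), results.append)
    xml.sax.parse(io.BytesIO(xml_bytes), handler)
    return results


doc = (b"<pub><book><title>A</title><price>10</price></book>"
       b"<book><title>B</title></book><title>C</title></pub>")
print(stream_select(doc, "/pub/book/title"))
```

The handler keeps only the stack of open element names and the text of the match currently in progress, so its working state grows with document depth rather than stream length. `stream_select` collects results in a list for convenience; passing a different callback as `emit` would let each result be delivered as soon as its closing tag arrives.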

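Bibliographic record

  • Author

    Peng, Feng

  • Author affiliation

    University of Maryland, College Park

  • Degree-granting institution: University of Maryland, College Park
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 265 p.
  • Total pages: 265
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology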