Efficient and scalable XML data processing using relational database systems.

Abstract

The Extensible Markup Language (XML) can serve as a standard format for storing semi-structured data sets. One approach to implementing an efficient and scalable XML database is to use a relational database system to store and query XML data. This approach is attractive because the relational database system is a mature technology with proven reliability and scalability. However, it also has several disadvantages. Storing and accessing XML data through an SQL interface (and therefore through the whole relational database call stack) incurs overhead that is not necessary for XML processing. The relational schema mapped from an XML schema may be inefficient for navigating between XML elements through parent-child or sibling relationships. Also, some XQuery features are hard to translate into SQL, or the resulting SQL is complex and inefficient.

This thesis addresses these issues for an XML database implemented using a relational database system. We first compare the performance of storing XML data in a relational database against several other XML storage strategies that use a file system or an object manager. Our results clearly indicate that when the XML schema information is available, using a relational database to store XML data is indeed a viable approach. We then identify a number of XQuery features that are either hard to translate into SQL or yield SQL that is complex and inefficient. We propose an extension to the relational database system that facilitates efficient XML query processing within the existing relational database execution framework. The extension can provide an order-of-magnitude performance improvement for queries such as long path expressions. In the third part of the thesis, we study how to implement an XML publish/subscribe system using a relational database. Our experiments demonstrate that the system has very good performance and scalability, handling millions of subscriptions with moderate amounts of physical memory.
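To make concrete why long path expressions can be costly when translated into SQL, here is a minimal sketch. It is not the storage scheme or the extension proposed in the thesis. It assumes a generic "one row per element" table (the table `node`, the functions `shred` and `path_to_sql`, and the sample document are all hypothetical names chosen for illustration) and uses Python's built-in `sqlite3` and `xml.etree.ElementTree` modules. Each step of a child-axis path becomes one more self-join on the same table.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document, for illustration only.
XML_DOC = """
<library>
  <book><title>A</title><author><name>Ann</name></author></book>
  <book><title>B</title><author><name>Bob</name></author></book>
</library>
"""


def shred(conn, xml_text):
    """Store every element as one row: (id, parent, pos, tag, text)."""
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, "
        "pos INTEGER, tag TEXT, text TEXT)"
    )
    next_id = 0

    def visit(elem, parent_id, pos):
        nonlocal next_id
        next_id += 1
        node_id = next_id  # ids are assigned in document order
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
            (node_id, parent_id, pos, elem.tag, (elem.text or "").strip()),
        )
        for i, child in enumerate(elem):
            visit(child, node_id, i)

    visit(ET.fromstring(xml_text.strip()), None, 0)


def path_to_sql(steps):
    """Translate a child-axis path (a list of tag names) into SQL.

    Every step adds one more alias of the node table, so a path with
    k steps turns into a k-way self-join.
    """
    last = len(steps) - 1
    tables = ", ".join(f"node AS n{i}" for i in range(len(steps)))
    conds = ["n0.parent IS NULL"]  # the first step must match the root
    for i in range(len(steps)):
        conds.append(f"n{i}.tag = ?")
        if i > 0:
            conds.append(f"n{i}.parent = n{i - 1}.id")
    return (
        f"SELECT n{last}.text FROM {tables} WHERE "
        + " AND ".join(conds)
        + f" ORDER BY n{last}.id"
    )


conn = sqlite3.connect(":memory:")
shred(conn, XML_DOC)
path = ["library", "book", "author", "name"]
sql = path_to_sql(path)
names = [row[0] for row in conn.execute(sql, path)]
print(sql)
print(names)
```

In this sketch the four-step path `/library/book/author/name` already needs four aliases of the same table. That growth in join count with path length is the kind of overhead the abstract points to for long path expressions.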

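Bibliographic record

  • Author

    Tian, Feng

  • Author affiliation

    The University of Wisconsin - Madison

  • Degree-granting institution: The University of Wisconsin - Madison
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2004
  • Pages: 109 p.
  • Total pages: 109
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Automation technology, computer technology
  • Keywords

  • Date added: 2022-08-17 11:43:28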
