Journal: Computer science

Storing and indexing XML documents upside down

Abstract

XML documents contain substantial redundancy in their structure part, because each path from the root node to a leaf node is explicitly represented and typically large sets of such path instances belong to a path class, i.e., the nodes of the path instances are labeled by the same sequence of element (or attribute) names. To save storage space and I/O cost, we want to get rid of this structural redundancy to the extent possible. While all known methods for the physical representation (storage) of XML documents proceed from the root via the element/attribute hierarchy (internal nodes) down to the leaves (values), we follow an upside-down approach which explicitly stores the values and only reconstructs the internal nodes, if needed. The cornerstones for such a solution are suitable node labels and a path synopsis which efficiently represents all path classes of an XML document. As a solution, we propose a compact internal storage format for native XML database systems where the inner structure of the stored documents is virtualized. Because this elementless storage format provides an efficient reconstruction of a document using its path synopsis, all processing properties are preserved and the semantics of navigational and declarative operations of XML languages remains unchanged. Adjusted indexes support the full spectrum of so-called content-and-structure single path queries.

Apart from greatly reduced storage consumption, our approach demonstrates its superiority, compared to competing methods, not only for a substantial fraction of those queries, but also for storing, reconstructing, and navigating XML documents.
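The idea of storing only the values and rebuilding the inner nodes on demand can be illustrated with a small sketch. It is not the authors' storage format: the abstract does not say which node labels are used, so the sketch assumes Dewey-style labels (tuples of child positions), and the names `shred`, `rebuild`, and the sample document are hypothetical. Attributes and mixed content are ignored for brevity.

```python
import xml.etree.ElementTree as ET


def shred(xml_text):
    """Split a document into a path synopsis and a list of leaf records.

    synopsis: path-class id -> tuple of element names from root to leaf
    leaves:   (label, path-class id, value) in document order, where the
              label is an assumed Dewey-style tuple of child positions
    """
    class_ids = {}  # tuple of element names -> path-class id
    leaves = []

    def visit(elem, names, label):
        names = names + (elem.tag,)
        children = list(elem)
        if not children:
            # Only leaves are stored; inner elements stay implicit.
            pcr = class_ids.setdefault(names, len(class_ids))
            leaves.append((label, pcr, (elem.text or "").strip()))
        for pos, child in enumerate(children, start=1):
            visit(child, names, label + (pos,))

    visit(ET.fromstring(xml_text), (), (1,))
    synopsis = {pcr: names for names, pcr in class_ids.items()}
    return synopsis, leaves


def rebuild(synopsis, leaves):
    """Reconstruct the element tree from leaf records and the synopsis."""
    nodes = {}  # label prefix -> already rebuilt element
    root = None
    for label, pcr, value in leaves:
        names = synopsis[pcr]
        # Each prefix of the label identifies one ancestor; its name is
        # found at the same depth in the leaf's path class.
        for depth in range(1, len(label) + 1):
            prefix = label[:depth]
            if prefix not in nodes:
                elem = ET.Element(names[depth - 1])
                nodes[prefix] = elem
                if depth == 1:
                    root = elem
                else:
                    nodes[prefix[:-1]].append(elem)
        nodes[label].text = value
    return root


doc = ("<bib><book><title>A</title><year>2001</year></book>"
       "<book><title>B</title><year>2005</year></book></bib>")
synopsis, leaves = shred(doc)
restored = ET.tostring(rebuild(synopsis, leaves), encoding="unicode")
```

In this sample, four leaf records share two path classes, so the element names `bib`, `book`, `title`, and `year` are kept once in the synopsis rather than once per path instance. The sketch relies on the leaf records being kept in document order so that siblings are re-attached in their original sequence.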
