Journal: ACM Transactions on Storage

Storing Semi-Structured Data on Disk Drives


Abstract

Applications that manage semi-structured data are becoming increasingly commonplace. Current approaches for storing semi-structured data use existing storage machinery; they either map the data to relational databases or use a combination of flat files and indexes. While these existing storage mechanisms provide readily available solutions, their suitability for this class of data needs closer examination. In particular, retrofitting existing solutions for semi-structured data can result in a mismatch between the tree structure of the data and the access characteristics of the underlying storage device (disk drive). This study explores the design space of native storage solutions for semi-structured data, examining alternative approaches that match application data access characteristics to those of the underlying disk drive. To evaluate the effectiveness of the proposed native techniques relative to the existing solutions, we experiment with XML data using the XPathMark benchmark. Extensive evaluation reveals the strengths and weaknesses of the proposed native data layout techniques. While the existing solutions work well for deep-focused queries into a semi-structured document (those that result in retrieving entire subtrees), the proposed native solutions substantially outperform them on non-deep-focused queries, which we demonstrate are at least as important as deep-focused ones. We believe that native data layout techniques offer a unique direction for improving the performance of semi-structured data stores for a variety of important workloads. However, given that the proposed native techniques require circumventing current storage stack abstractions, further investigation is warranted before they can be applied to general-purpose storage systems.
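The distinction between deep-focused and non-deep-focused queries can be illustrated with a minimal sketch. This example is not taken from the paper: the toy document, the element names, and the helper `subtree_size` are hypothetical, and Python's standard `xml.etree.ElementTree` module stands in for a real XML store. It only shows which nodes each kind of query touches; it says nothing about on-disk layout or performance.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy document; element names are illustrative only.
DOC = """
<site>
  <people>
    <person id="p0"><name>Ann</name><address><city>Oslo</city><zip>0150</zip></address></person>
    <person id="p1"><name>Bo</name><address><city>Rome</city><zip>00100</zip></address></person>
  </people>
</site>
""".strip()

root = ET.fromstring(DOC)


def subtree_size(elem):
    """Count the element nodes in the subtree rooted at elem, including elem."""
    return sum(1 for _ in elem.iter())


# Deep-focused query: the result is one element together with its entire subtree.
deep_result = root.find("./people/person[@id='p0']")
deep_nodes = subtree_size(deep_result)

# Non-deep-focused query: the result takes one shallow node from each of
# several subtrees and never needs the descendants stored under <address>.
shallow_result = [n.text for n in root.findall("./people/person/name")]

print(deep_nodes)
print(shallow_result)
```

In the first query every node under the matched `person` is needed, so a layout that keeps each subtree contiguous serves it well. In the second query the needed nodes are spread across sibling subtrees, which is the kind of access pattern the abstract argues a native layout could serve better.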
