Storing Semi-Structured Data on Disk Drives

MEDHA BHADKAMKAR; FERNANDO FARFAN; VAGELIS HRISTIDIS; RAJU RANGASWAMI

Abstract

Applications that manage semi-structured data are becoming increasingly commonplace. Current approaches for storing semi-structured data use existing storage machinery; they either map the data to relational databases, or use a combination of flat files and indexes. While employing these existing storage mechanisms provides readily available solutions, there is a need to more closely examine their suitability to this class of data. Particularly, retrofitting existing solutions for semi-structured data can result in a mismatch between the tree structure of the data and the access characteristics of the underlying storage device (disk drive). This study explores various possibilities in the design space of native storage solutions for semi-structured data by exploring alternative approaches that match application data access characteristics to those of the underlying disk drive. For evaluating the effectiveness of the proposed native techniques in relation to the existing solution, we experiment with XML data using the XPathMark benchmark. Extensive evaluation reveals the strengths and weaknesses of the proposed native data layout techniques. While the existing solutions work really well for deep-focused queries into a semi-structured document (those that result in retrieving entire subtrees), the proposed native solutions substantially outperform for the non-deep-focused queries, which we demonstrate are at least as important as the deep-focused. We believe that native data layout techniques offer a unique direction for improving the performance of semi-structured data stores for a variety of important workloads. However, given that the proposed native techniques require circumventing current storage stack abstractions, further investigation is warranted before they can be applied to general-purpose storage systems.
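
The mismatch the abstract mentions between tree-structured data and a linear storage device can be made concrete with a toy model. The sketch below is not the authors' layout technique, which the abstract does not describe. It assumes a flat-file style layout that stores nodes in document (depth-first) order, one node per slot, and counts how many contiguous slot ranges a query has to touch. The names Node, document_order_layout, subtree, and contiguous_runs are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)


def document_order_layout(root):
    """Assign each node a slot in depth-first (document) order.

    Assumption: one node per slot, as a simple flat file might store it.
    """
    slots = {}
    stack = [root]
    while stack:
        node = stack.pop()
        slots[id(node)] = len(slots)
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.children))
    return slots


def subtree(node):
    """Return the node and all of its descendants."""
    nodes = [node]
    for child in node.children:
        nodes.extend(subtree(child))
    return nodes


def contiguous_runs(positions):
    """Count maximal runs of consecutive slots (a rough proxy for seeks)."""
    ordered = sorted(positions)
    return sum(1 for i, p in enumerate(ordered) if i == 0 or p != ordered[i - 1] + 1)


def make_section(name):
    return Node(name, [Node(name + "-1"), Node(name + "-2")])


root = Node("root", [make_section("a"), make_section("b"), make_section("c")])
slots = document_order_layout(root)

# Deep-focused query: fetch the entire subtree rooted at "b".
deep_runs = contiguous_runs(slots[id(n)] for n in subtree(root.children[1]))

# Non-deep-focused query: fetch only the root's children, no descendants.
shallow_runs = contiguous_runs(slots[id(c)] for c in root.children)

print(deep_runs, shallow_runs)  # prints "1 3"
```

In this toy model the subtree retrieval reads a single contiguous range, while the children-only query touches three separate ranges. That is consistent with the abstract's observation that existing solutions suit deep-focused queries, though real disk behavior depends on many factors this sketch ignores.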

Bibliographic details

Source: ACM Transactions on Storage, 2009, issue 2, 35 pages
Authors: MEDHA BHADKAMKAR; FERNANDO FARFAN; VAGELIS HRISTIDIS; RAJU RANGASWAMI
Original format: PDF
Language: English
Chinese Library Classification: storage devices (存贮器)
Keywords: Architecture; Design; Algorithms; Performance; Semi-structured data; Storage management; XML
