Journal: The VLDB Journal

A cost model for random access queries in document stores

Abstract

Document stores have become one of the key NoSQL storage solutions. They have been widely adopted across domains because they can store semi-structured data and offer expressive query capabilities. However, implementations differ in how they concretely store and retrieve data. No standard framework for data and query optimization exists for document stores, and only implementation-specific design and query guidelines are in use. Hence, the goal of this work is to help automate data design for document stores based on query costs instead of generic design rules. To this end, we define a generic storage and query cost model, based on disk access and memory allocation, that allows estimating the impact of design decisions. Since all document stores carry out data operations in memory, we first estimate memory usage by considering the characteristics of the stored documents, their access patterns, and the memory management algorithms. Then, using this estimate and the metadata storage size, we introduce a cost model for random access queries. We validate our work on two well-known document store implementations: MongoDB and Couchbase. The results show that the memory usage estimates have an average precision of 91% and that the predicted costs are highly correlated with the actual execution times. In the course of this work, we were also able to suggest several improvements to document storage systems. Thus, this cost model also helps identify discrepancies between document store implementations and their theoretical expectations.
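As a rough illustration of the kind of estimate the abstract describes, the sketch below computes the expected number of disk reads for one random access query from the share of documents that fit in memory. It is not the paper's model. It assumes uniformly random access, whole documents cached in memory, one disk read per cache miss, and no metadata or index overhead. The function names and parameters are hypothetical.

```python
def cache_hit_ratio(num_docs: int, avg_doc_bytes: int, cache_bytes: int) -> float:
    """Fraction of documents resident in memory (assumption: uniform access)."""
    if num_docs <= 0 or avg_doc_bytes <= 0:
        raise ValueError("num_docs and avg_doc_bytes must be positive")
    data_bytes = num_docs * avg_doc_bytes
    # If the whole data set fits in the cache, every lookup is a hit.
    return min(1.0, cache_bytes / data_bytes)


def expected_disk_reads(num_docs: int, avg_doc_bytes: int, cache_bytes: int) -> float:
    """Expected disk reads per random access query (assumption: one read per miss)."""
    return 1.0 - cache_hit_ratio(num_docs, avg_doc_bytes, cache_bytes)


if __name__ == "__main__":
    # Hypothetical workload: 1,000,000 documents of 1 KiB each, 512 MiB of cache.
    reads = expected_disk_reads(1_000_000, 1024, 512 * 1024 * 1024)
    print(f"expected disk reads per query: {reads:.6f}")
```

Under these assumptions the estimate depends only on the ratio of cache size to data size. A realistic model, as the abstract notes, would also account for access patterns, memory management algorithms, and metadata storage size.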
