Journal: Information Systems

Schema-independent querying for heterogeneous collections in NoSQL document stores


Abstract

NoSQL document stores are well-tailored to efficiently load and manage massive collections of heterogeneous documents without any prior structural validation. However, this flexibility becomes a serious challenge when querying heterogeneous documents, and hence the user has to build complex queries or reformulate existing queries whenever new schemas are introduced in a collection. In this paper we propose a novel approach, based on formal foundations, for building schema-independent queries which are designed to query multi-structured documents. We present a query enrichment mechanism that consults a pre-constructed dictionary. This dictionary binds each possible path in the documents to all its corresponding absolute paths in all the documents. We automate the process of query reformulation via a set of rules that reformulate most document store operators, such as select, project, unnest, aggregate and lookup. We then produce queries across multi-structured documents which are compatible with the native query engine of the underlying document store. To evaluate our approach, we conducted experiments on synthetic datasets. Our results show that the induced overhead can be acceptable when compared to the efforts needed to restructure the data or the time required to execute several queries corresponding to the different schemas inside the collection. (C) 2019 Elsevier Ltd. All rights reserved.
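The dictionary and the reformulation of a select operator described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`absolute_paths`, `build_dictionary`, `reformulate_select`), the sample documents, and the choice to treat every dotted suffix of an absolute path as a "possible path" are all assumptions made for illustration. The output filter uses the MongoDB-style `$or` operator with dot notation as one example of a native query form.

```python
from collections import defaultdict


def absolute_paths(doc, prefix=""):
    """Yield every dotted absolute path in a nested dict (inner nodes included)."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        yield path
        if isinstance(value, dict):
            yield from absolute_paths(value, path)


def build_dictionary(collection):
    """Map each path suffix to the set of absolute paths that end with it.

    Assumption: a "possible path" is any dotted suffix of an absolute path,
    e.g. "title" and "info.title" both resolve to "info.title".
    """
    dictionary = defaultdict(set)
    for doc in collection:
        for path in absolute_paths(doc):
            parts = path.split(".")
            for i in range(len(parts)):
                dictionary[".".join(parts[i:])].add(path)
    return dictionary


def reformulate_select(dictionary, path, value):
    """Rewrite the predicate `path == value` over all known absolute paths."""
    targets = sorted(dictionary.get(path, ()))
    if not targets:
        # Unknown path: leave the predicate as the user wrote it.
        return {path: value}
    if len(targets) == 1:
        return {targets[0]: value}
    # Several structures hold this path: match any of them.
    return {"$or": [{target: value} for target in targets]}


# Hypothetical heterogeneous collection: the same field at three depths.
collection = [
    {"title": "A", "year": 2017},
    {"info": {"title": "B", "year": 1997}},
    {"film": {"details": {"title": "C"}}},
]

dictionary = build_dictionary(collection)
query = reformulate_select(dictionary, "title", "B")
print(query)
```

The user writes the predicate once against `title`; the rewritten filter covers all three structures, so a single query replaces one query per schema. When a new structure appears in the collection, only the dictionary needs to be refreshed, not the user's query.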
