Journal: International Journal of Information Management

Efficient querying of multidimensional RDF data with aggregates: Comparing NoSQL, RDF and relational data stores

Abstract

This paper proposes an approach to querying large volumes of statistical RDF data. Our approach relies on pre-aggregation strategies to better support the analysis of this kind of data. Specifically, we define a conceptual model that represents the original RDF data, together with aggregates, in a multidimensional structure. We then propose a set of translation rules for converting a well-known multidimensional RDF modelling vocabulary into this conceptual model. We implement the conceptual model in six different data stores: two RDF triple stores (Jena TDB and Virtuoso), one graph-oriented NoSQL database (Neo4j), one column-oriented data store (Cassandra), and two relational databases (MySQL and PostgreSQL). We compare querying performance, with and without aggregates, across these data stores. Experimental results on real-world datasets containing 81.92 million triples show that pre-aggregation reduces query runtime in all data stores. Neo4j and the relational databases, when using aggregates, outperform the triple stores, reducing query runtime by up to 99%.
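The pre-aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses Python's built-in `sqlite3` as a stand-in for a relational store, and the table names (`observation`, `agg_country_year`), the columns, and the sample figures are all hypothetical. The point is only that a roll-up computed once can answer a coarse-grained query without scanning the detailed facts.

```python
import sqlite3


def build_store():
    """Create an in-memory store with detailed facts and one pre-aggregate."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE observation (        -- hypothetical detailed facts
            city TEXT, country TEXT, year INTEGER, population INTEGER
        );
        CREATE TABLE agg_country_year (   -- hypothetical pre-aggregate
            country TEXT, year INTEGER, population INTEGER
        );
    """)
    rows = [  # made-up sample data, for illustration only
        ("Toulouse", "FR", 2019, 480),
        ("Paris", "FR", 2019, 2160),
        ("Paris", "FR", 2020, 2150),
        ("Berlin", "DE", 2019, 3640),
    ]
    conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?)", rows)
    # Pre-aggregation step: roll the city level up to the country level once.
    conn.execute("""
        INSERT INTO agg_country_year
        SELECT country, year, SUM(population)
        FROM observation
        GROUP BY country, year
    """)
    conn.commit()
    return conn


def total_without_aggregate(conn, country, year):
    """Answer the query by summing the detailed observations at query time."""
    row = conn.execute(
        "SELECT SUM(population) FROM observation WHERE country = ? AND year = ?",
        (country, year),
    ).fetchone()
    return row[0]


def total_with_aggregate(conn, country, year):
    """Answer the same query with a lookup in the pre-aggregated table."""
    row = conn.execute(
        "SELECT population FROM agg_country_year WHERE country = ? AND year = ?",
        (country, year),
    ).fetchone()
    return row[0] if row else None
```

Both functions return the same total for any (country, year) pair; the second reads a single pre-computed row, which is the kind of saving the abstract attributes to pre-aggregation. The sketch makes no claim about the paper's actual schema or measured speed-ups.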
