RDF Data Storage Techniques for Efficient SPARQL Query Processing Using Distributed Computation Engines


Abstract

The rapidly growing amount of linked open data demands semantic RDF services that are efficient, scalable, and distributed, with high availability and fault tolerance. To address this need, the Big Data processing infrastructure Hadoop has been adopted for RDF data management systems. In this paper, we introduce two distributed RDF data stores, VPExp and 3CStore, based on the existing vertical partitioning (VP) approach. In the VPExp approach, we propose splitting predicates based on the explicit type information of an object. The 3CStore scheme is designed as a 3-column store, comprising a subset of triples from the VP table based on different join correlations, to reduce the number of join operations when executing SPARQL queries as SQL in a distributed system. We evaluate these two RDF data storage approaches by comparing them with the vertical partitioning approach and the state-of-the-art RDF management system S2RDF. We also present an evaluation of the query performance of these systems built upon two popular distributed computation engines, namely Spark and Drill.
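To make the storage layouts named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It shows classic vertical partitioning (one two-column table per predicate), one possible reading of a type-based predicate split in the spirit of VPExp, and how a two-pattern SPARQL query becomes a SQL join over VP tables. The toy triples, the function names, and the use of an in-memory SQLite database in place of Spark or Drill are all assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical toy data; each entity is assumed to have at most one explicit type.
TRIPLES = [
    ("alice", RDF_TYPE, "Student"),
    ("carol", RDF_TYPE, "Student"),
    ("bob", RDF_TYPE, "Professor"),
    ("cs101", RDF_TYPE, "Course"),
    ("alice", "takes", "cs101"),
    ("bob", "teaches", "cs101"),
    ("alice", "advisor", "bob"),
    ("carol", "advisor", "bob"),
]


def vertical_partition(triples):
    """Classic VP: one (subject, object) table per predicate."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return dict(tables)


def split_by_object_type(triples):
    """Assumed VPExp-like split: key tables by (predicate, explicit object type)."""
    explicit_type = {s: o for s, p, o in triples if p == RDF_TYPE}
    tables = defaultdict(list)
    for s, p, o in triples:
        # For rdf:type triples the object is itself the class;
        # otherwise look up the object's declared type (None if undeclared).
        obj_type = o if p == RDF_TYPE else explicit_type.get(o)
        tables[(p, obj_type)].append((s, o))
    return dict(tables)


def load_vp_tables(conn, vp_tables):
    """Materialise each VP table as a two-column SQL table."""
    for predicate, rows in vp_tables.items():
        conn.execute(f'CREATE TABLE "{predicate}" (s TEXT, o TEXT)')
        conn.executemany(f'INSERT INTO "{predicate}" VALUES (?, ?)', rows)


# SPARQL: SELECT ?student ?prof WHERE { ?student takes ?c . ?prof teaches ?c }
# Each triple pattern maps to one VP table; the shared variable ?c becomes a join.
SQL = '''
SELECT t.s AS student, p.s AS prof
FROM "takes" AS t JOIN "teaches" AS p ON t.o = p.o
ORDER BY student, prof
'''

vp_tables = vertical_partition(TRIPLES)
typed_tables = split_by_object_type(TRIPLES)

conn = sqlite3.connect(":memory:")  # stand-in for a distributed SQL engine
load_vp_tables(conn, vp_tables)
result = conn.execute(SQL).fetchall()
```

In this sketch every additional triple pattern that shares a variable adds one more join over VP tables, which is the cost that a layout with fewer joins aims to reduce. The type-keyed tables in `typed_tables` hold fewer rows each, so a query that constrains the object's type could scan a smaller table.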


Bibliographic details

Source: 2018 IEEE 19th International Conference on Information Reuse and Integration for Data Science, 2018, pp. 323-330 (8 pages)
Conference location: Salt Lake City (US)
Authors: Mahmudul Hassan; Srividya K. Bansal
Original format: PDF
Language: English
Keywords: Resource description framework; Sparks; Engines; Distributed databases; Correlation; Query processing; Semantics


Similar literature


1. Towards efficient SPARQL query processing on RDF data [J]. Liu Chang, Wang Haofen, Yu Yong, Tsinghua Science and Technology. 2010, No. 6
2. Processing SPARQL queries over distributed RDF graphs [J]. Peng Peng, Zou Lei, Ozsu M. Tamer, The VLDB Journal. 2016, No. 2
3. R3F: RDF triple filtering method for efficient SPARQL query processing [J]. Kim Kisung, Moon Bongki, Kim Hyoung-Joo, World Wide Web. 2015, No. 2
4. RDF Data Storage Techniques for Efficient SPARQL Query Processing Using Distributed Computation Engines [C]. Mahmudul Hassan, Srividya K. Bansal, IEEE International Conference on Information Reuse and Integration. 2018
5. Distributed RDF Storage and Querying Using In-Memory Processing Engine [D]. Hassan, P. M. Mahmudul. 2021
6. SPANG: a SPARQL client supporting generation and reuse of queries for distributed RDF databases [O]. Hirokazu Chiba, Ikuo Uchiyama. 2017
7. Research on Efficient SPARQL Query Processing for RDF Data [O]. Yi Zhang. 2015
