Cluster-Based Join for Geographically Distributed Big RDF Data



Abstract

Federated RDF systems allow users to retrieve data from multiple independent sources without having to hold all the data in the same triple store. The performance of these systems can be poor for large, geographically distributed RDF data, where network transfer costs are high. This paper introduces CBTP, a novel join algorithm that takes advantage of network topology to decrease the cost of processing SPARQL queries in a geographically distributed environment. Federation members are grouped into clusters based on the network communication cost between the members, and the bulk of the join processing is pushed to the clusters. We use an overlap list to efficiently compute join results from triples in different clusters. We implement our algorithms in the OpenRDF Sesame federated framework and use Apache Rya triple store instances as federation members. Experimental evaluation results show the advantages of our approach over existing techniques.
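
The grouping step mentioned in the abstract can be illustrated with a small sketch. This is not the CBTP algorithm from the paper, whose clustering method is not described here; it is only one simple way to group federation members by pairwise network cost, using a threshold and union-find. The function name `cluster_members`, the member names, the cost values, and the threshold are all hypothetical.

```python
from itertools import combinations


def cluster_members(members, cost, threshold):
    """Group members so that any pair with cost <= threshold shares a cluster.

    Illustrative single-linkage grouping; NOT the paper's method.
    `cost` maps frozenset({a, b}) to a symmetric communication cost.
    """
    parent = {m: m for m in members}

    def find(m):
        # Follow parent links to the cluster representative (path halving).
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in combinations(members, 2):
        if cost[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for m in members:
        clusters.setdefault(find(m), []).append(m)
    # Sort for a deterministic result.
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical federation members and round-trip costs in milliseconds.
members = ["us-east", "us-west", "eu-west", "eu-central"]
cost = {
    frozenset(("us-east", "us-west")): 30,
    frozenset(("eu-west", "eu-central")): 15,
    frozenset(("us-east", "eu-west")): 120,
    frozenset(("us-east", "eu-central")): 120,
    frozenset(("us-west", "eu-west")): 120,
    frozenset(("us-west", "eu-central")): 120,
}

clusters = cluster_members(members, cost, threshold=50)
print(clusters)
```

With these assumed costs, members on the same continent end up in the same cluster, so a join between them would not need to cross the expensive inter-continental links.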


Bibliographic details

Source
IEEE International Congress on Big Data | 2019 | 1 v. | 9 pages
Authors
Fan Yang; Adina Crainiceanu; Zhiyuan Chen; Don Needham
Original format: PDF
Chinese Library Classification: Programming, software engineering
Keywords
Big Data; pattern clustering; query languages; query processing; semantic Web; topology;


Similar literature

1. Distributed Join Query Processing for Big RDF Data [J] . Advanced Science Letters . 2018, No. 10
2. Distributed Top-K Join Queries Optimizing for RDF Datasets [J] . Gu Jinguang, Dong Hao, Liu Zhao, International journal of web services research . 2017, No. 3
3. Adaptive mechanism for distributed query processing and data loading using the RDF data in the cloud [J] . Dharmaraj Chandrasekaran Ranichandra, Tripathy BalaKrushna, International journal of communication systems . 2018, No. 15
4. Cluster-Based Join for Geographically Distributed Big RDF Data [C] . Fan Yang, Adina Crainiceanu, Zhiyuan Chen, IEEE International Congress on Big Data . 2019
5. Distributed RDF query processing and reasoning for Big Data Linked Data. [D] . Perasani, Anudeep. 2014
6. SPANG: a SPARQL client supporting generation and reuse of queries for distributed RDF databases [O] . Hirokazu Chiba, Ikuo Uchiyama 2017
7. Towards Load Balancing and Parallelizing of RDF Query Processing in P2P Based Distributed RDF Data Stores [O] . Liaquat Ali, Thomas Janson, Christian Schindelhauer 2015
