Conference: IEEE International Congress on Big Data

Cluster-Based Join for Geographically Distributed Big RDF Data

Abstract

Federated RDF systems let users retrieve data from multiple independent sources without having to hold all the data in a single triple store. These systems can perform poorly on large, geographically distributed RDF data, where network transfer costs are high. This paper introduces CBTP, a novel join algorithm that takes advantage of network topology to decrease the cost of processing SPARQL queries in a geographically distributed environment. Federation members are grouped into clusters based on the network communication cost between them, and the bulk of the join processing is pushed to the clusters. We use an overlap list to efficiently compute join results from triples in different clusters. We implement our algorithms in the OpenRDF Sesame federation framework and use Apache Rya triple store instances as federation members. Experimental evaluation results show the advantages of our approach over existing techniques.
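The general idea described in the abstract (cluster members by communication cost, join inside each cluster first, then handle only the cross-cluster matches) can be illustrated with a small Python sketch. This is not the paper's CBTP implementation. The function names, the cost threshold, the single-linkage grouping, and the reading of "overlap list" as the set of join keys that occur on the left side of one cluster and the right side of another are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import permutations


def cluster_members(cost, threshold):
    """Group members linked by a pairwise cost <= threshold.

    Assumption: simple single-linkage grouping via union-find;
    the paper's actual clustering method is not given in the abstract.
    """
    members = sorted({m for pair in cost for m in pair})
    parent = {m: m for m in members}

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path halving
            m = parent[m]
        return m

    for (a, b), c in cost.items():
        if c <= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for m in members:
        groups[find(m)].append(m)
    return sorted(groups.values())


def hash_join(left, right):
    """Join (key, value) rows from two sides on their key."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(k, lv, rv) for k, lv in left for rv in index.get(k, ())]


def clustered_join(clusters, data):
    """data[member] = (left_rows, right_rows), each a list of (key, value)."""
    results = []
    sides = []
    # Step 1: join inside each cluster, where transfer is assumed cheap.
    for cluster in clusters:
        left = [row for m in cluster for row in data[m][0]]
        right = [row for m in cluster for row in data[m][1]]
        results.extend(hash_join(left, right))
        sides.append((left, right))
    # Step 2: for each ordered pair of distinct clusters, ship only rows
    # whose key is in the (hypothetical) overlap list for that pair.
    for i, j in permutations(range(len(sides)), 2):
        overlap = {k for k, _ in sides[i][0]} & {k for k, _ in sides[j][1]}
        left = [row for row in sides[i][0] if row[0] in overlap]
        right = [row for row in sides[j][1] if row[0] in overlap]
        results.extend(hash_join(left, right))
    return sorted(results)
```

In this sketch the cross-cluster step moves only rows whose join key appears in the overlap set, so rows that can only match locally never leave their cluster. The combined result is the same as a single global join over all members.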