
An Adaptive SPARQL Engine with Dynamic Partitioning for Distributed RDF Repositories



Abstract

The tremendous increase in semantic data is driving the demand for efficient query engines. RDF data is being generated at an unprecedented rate, which introduces storage, indexing, and querying challenges. Due to the size of the data and the federated nature of the semantic web, it is in many cases impractical to assume a central repository, and more attention is being given to distributed RDF stores. This work is motivated by two major drawbacks of current solutions: 1) the pre-processing phase is very expensive and takes a prohibitively long time for large datasets, and 2) current distributed systems assume that a static partitioning of the data should perform well for all kinds of queries, and do not consider fluctuations in the query load.

In this paper we propose PHD-Store, an in-memory SPARQL engine for distributed RDF repositories. Our system does not assume any particular initial placement of the data and does not require pre-processing before running the first query. It analyzes incoming queries and adjusts data placement dynamically in such a way that communication among repositories is minimized for future queries. To achieve this flexibility, frequent query patterns are detected, and data are redistributed through a Propagating Hash Distribution (PHD) algorithm to ensure optimal placement for frequent query patterns. Our experiments with large RDF graphs verify that PHD-Store scales well and executes complex queries more efficiently than existing systems.
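The abstract does not spell out how PHD works, so the following is only a rough illustration of the general idea it describes (hash placement, detecting frequent query patterns, then co-locating data for them), not the thesis's actual algorithm. All function names, the predicate-based pattern signature, and the two-hop chain query shape are assumptions made for this sketch.

```python
from collections import Counter
import zlib


def worker_for(term, n_workers):
    # Stable hash: Python's built-in hash() is salted per process for strings.
    return zlib.crc32(term.encode("utf-8")) % n_workers


def initial_placement(triples, n_workers):
    # No pre-processing: each triple simply goes to the worker its subject hashes to.
    workers = [set() for _ in range(n_workers)]
    for s, p, o in triples:
        workers[worker_for(s, n_workers)].add((s, p, o))
    return workers


def pattern_key(query):
    # Crude signature of a query (assumption): the sorted predicates of its
    # triple patterns, ignoring variable names and constants.
    return tuple(sorted(p for _, p, _ in query))


def frequent_patterns(query_log, min_count):
    # Signatures seen at least min_count times in the query log.
    counts = Counter(pattern_key(q) for q in query_log)
    return [key for key, c in counts.items() if c >= min_count]


def colocate_chain(triples, first_pred, second_pred, n_workers):
    # For a frequent chain  ?x first_pred ?y . ?y second_pred ?z,
    # copy each second_pred triple to the worker that holds the first_pred
    # triple pointing at it, so the join needs no cross-worker communication.
    workers = initial_placement(triples, n_workers)
    by_subject = {}
    for s, p, o in triples:
        if p == second_pred:
            by_subject.setdefault(s, []).append((s, p, o))
    for s, p, o in triples:
        if p == first_pred:
            target = worker_for(s, n_workers)
            workers[target].update(by_subject.get(o, []))
    return workers
```

In this sketch data is replicated rather than moved, which trades memory for locality; whether PHD-Store moves or replicates triples is not stated in the abstract.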

Bibliographic record

  • Author

    Ibrahim Yasser E.

  • Author affiliation
  • Year 2012
  • Total pages
  • Original format PDF
  • Text language en
  • Chinese Library Classification
