Journal: Knowledge and Data Engineering, IEEE Transactions on

DiploCloud: Efficient and Scalable Management of RDF Data in the Cloud

Abstract

Despite recent advances in distributed RDF data management, processing large amounts of RDF data in the cloud is still very challenging. In spite of its seemingly simple data model, RDF actually encodes rich and complex graphs mixing both instance and schema-level data. Sharding such data using classical techniques or partitioning the graph using traditional min-cut algorithms leads to very inefficient distributed operations and to a high number of joins. In this paper, we describe DiploCloud, an efficient and scalable distributed RDF data management system for the cloud. Contrary to previous approaches, DiploCloud runs a physiological analysis of both instance and schema information prior to partitioning the data. We describe the architecture of DiploCloud, its main data structures, as well as the new algorithms we use to partition and distribute data. We also present an extensive evaluation of DiploCloud showing that our system is often two orders of magnitude faster than state-of-the-art systems on standard workloads.
