
RDF-3X: a RISC-style Engine for RDF



Abstract

RDF is a data representation format for schema-free structured information that is gaining momentum in the context of Semantic-Web corpora, life sciences, and also Web 2.0 platforms. The "pay-as-you-go" nature of RDF and the flexible pattern-matching capabilities of its query language SPARQL entail efficiency and scalability challenges for complex queries including long join paths. This paper presents the RDF-3X engine, an implementation of SPARQL that achieves excellent performance by pursuing a RISC-style architecture with a streamlined architecture and carefully designed, puristic data structures and operations. The salient points of RDF-3X are: 1) a generic solution for storing and indexing RDF triples that completely eliminates the need for physical-design tuning, 2) a powerful yet simple query processor that leverages fast merge joins to the largest possible extent, and 3) a query optimizer for choosing optimal join orders using a cost model based on statistical synopses for entire join paths. The performance of RDF-3X, in comparison to the previously best state-of-the-art systems, has been measured on several large-scale datasets with more than 50 million RDF triples and benchmark queries that include pattern matching and long join paths in the underlying data graphs.
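The indexing and merge-join ideas in points 1) and 2) can be illustrated with a small sketch. The abstract does not describe the index layout, so the code below assumes exhaustive indexing: one sorted copy of the triples for each of the six orderings of subject, predicate, and object. Sorted Python lists stand in for the engine's on-disk structures, plain strings stand in for encoded identifiers, and the names `build_indexes`, `scan`, and `merge_join` are hypothetical, not part of RDF-3X.

```python
from bisect import bisect_left
from itertools import permutations


def build_indexes(triples):
    """Build one sorted list per ordering of (s, p, o), keyed e.g. 'pos'.

    Assumption: in-memory sorted lists replace the real storage layer.
    """
    unique = set(triples)  # RDF data is a set of triples
    indexes = {}
    for order in permutations("spo"):
        positions = ["spo".index(c) for c in order]
        indexes["".join(order)] = sorted(
            tuple(t[i] for i in positions) for t in unique
        )
    return indexes


def scan(indexes, order, prefix):
    """Range scan: yield index entries that start with `prefix`, in sorted order."""
    entries = indexes[order]
    i = bisect_left(entries, prefix)
    while i < len(entries) and entries[i][: len(prefix)] == prefix:
        yield entries[i]
        i += 1


def merge_join(left, right):
    """Join two sorted lists of distinct keys in a single linear pass."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] == right[j]:
            out.append(left[i])
            i += 1
            j += 1
        elif left[i] < right[j]:
            i += 1
        else:
            j += 1
    return out


# Toy data, invented for illustration only.
triples = [
    ("alice", "type", "Person"),
    ("bob", "type", "Person"),
    ("alice", "livesIn", "Berlin"),
    ("carol", "livesIn", "Berlin"),
    ("bob", "livesIn", "Paris"),
]
indexes = build_indexes(triples)

# Query: ?x type Person . ?x livesIn Berlin
# Scanning the 'pos' index with a (predicate, object) prefix returns
# subjects already sorted, so no extra sort is needed before the join.
people = [s for (_, _, s) in scan(indexes, "pos", ("type", "Person"))]
berliners = [s for (_, _, s) in scan(indexes, "pos", ("livesIn", "Berlin"))]
result = merge_join(people, berliners)
```

Because every ordering is available, any triple pattern with bound constants can be answered by a prefix scan whose output is sorted on the remaining variable. That is what allows merge joins to be used widely without per-workload index tuning.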
