GoFast: Graph-based optimization for efficient and scalable query evaluation

Zouaghi Ishaq; Mesmoudi Amin; Galicia Jorge; Bellatreche Ladjel; Aguili Taoufik

Abstract

The popularity of the Resource Description Framework (RDF) and SPARQL has driven the development of high-performance systems to manage data represented with this model. Early approaches adapted the well-established relational model, applying its storage, query processing, and optimization strategies. However, techniques borrowed from the relational model are not universally applicable in the RDF context. First, the schema-free nature of RDF induces intensive join overheads. Second, optimization strategies that try to find the optimal join order rely on error-prone statistics that cannot capture all the correlations among triples. Graph-based approaches keep the graph structure of RDF, representing the data directly as a graph. Their execution model relies on graph exploration operators to find subgraph matches to a query. Although they have been shown to outperform relational-based systems on complex queries, they scale poorly and their optimization techniques are completely system dependent. Recently, some systems such as RDF_QDAG have shown that combining graph exploration and triple clustering can achieve a good compromise between performance and scalability. In this paper, we propose optimization strategies for this kind of RDF management system. First, we define novel statistics collected for clusters of triples to better capture the dependencies found in the original graph. Second, we redefine an execution plan based on these logical structures, which makes it possible to represent the RDF graph exploration process. Third, we introduce an algorithm for selecting the optimal execution plan based on a customized cost model. Finally, we propose a new approach to refine the chosen plan by pruning invalid clusters that do not participate in the construction of the final query results. All our proposals are validated experimentally using well-known RDF benchmarks. (C) 2021 Elsevier Ltd. All rights reserved.
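
The abstract does not give the cost model or the plan-selection algorithm, so the following is only a toy sketch of the general idea of cost-based plan selection, not the method of the paper. It assumes hypothetical per-pattern cardinality estimates and pairwise selectivities (all names and numbers are invented for illustration), scores each exploration order by the sum of its estimated intermediate result sizes, and picks the cheapest order by exhaustive enumeration.

```python
from itertools import permutations


def plan_cost(order, cardinality, selectivity):
    """Estimate the cost of evaluating triple patterns in the given order.

    Cost is the sum of estimated intermediate result sizes. This is an
    assumed, textbook-style model, not the cost model of the paper.
    """
    size = float(cardinality[order[0]])
    cost = size
    for i in range(1, len(order)):
        nxt = order[i]
        # Combine the selectivities between the new pattern and every
        # pattern already placed; unrelated pairs default to 1.0.
        factor = 1.0
        for prev in order[:i]:
            factor *= selectivity.get(frozenset((prev, nxt)), 1.0)
        size = size * cardinality[nxt] * factor
        cost += size
    return cost


def best_plan(cardinality, selectivity):
    """Return the cheapest order by brute force (fine for a few patterns)."""
    return min(
        permutations(cardinality),
        key=lambda order: plan_cost(order, cardinality, selectivity),
    )


# Hypothetical statistics for three triple patterns of one query.
cardinality = {"tp1": 1000, "tp2": 10, "tp3": 50}
selectivity = {
    frozenset(("tp1", "tp2")): 0.01,
    frozenset(("tp2", "tp3")): 0.1,
}

chosen = best_plan(cardinality, selectivity)
chosen_cost = plan_cost(chosen, cardinality, selectivity)
```

With these invented numbers the cheapest order starts from the most selective pattern, `tp2`. Exhaustive enumeration is factorial in the number of patterns, so a real optimizer would need a smarter search; the point here is only how estimated cardinalities drive the choice of plan.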

Bibliographic record

Source
Information Systems | 2021, Issue 7 | 101738.1-101738.18 | 18 pages
Authors
Zouaghi Ishaq; Mesmoudi Amin; Galicia Jorge; Bellatreche Ladjel; Aguili Taoufik
Author affiliations

LIAS ISAE ENSMA Chasseneuil France|LR SysCom ENIT UTM Tunis Tunisia;

Univ Poitiers LIAS Poitiers France;

LIAS ISAE ENSMA Chasseneuil France;

LIAS ISAE ENSMA Chasseneuil France;

LR SysCom ENIT UTM Tunis Tunisia;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: eng
Keywords
Optimization; RDF; SPARQL; Cardinality estimation; Cost model

