Dynamic and fast processing of queries on large-scale RDF data

Pingpeng Yuan; Changfeng Xie; Hai Jin; Ling Liu; Guang Yang; Xuanhua Shi

Abstract

As RDF data continue to gain popularity, both the number of RDF repositories and the size of individual RDF datasets are growing rapidly. Many known RDF datasets contain billions of RDF triples (subject, predicate, and object). One of the grand challenges in managing such large RDF data is executing RDF queries efficiently. In this paper, we address the query processing problems posed by billion-triple datasets. We first identify some causes of the problems in existing query optimization schemes, such as large intermediate results and errors in initial query cost estimation. We then present a block-oriented dynamic query plan generation approach with pipelined execution. The approach consists of two phases. In the first phase, a near-optimal execution plan is chosen by identifying the processing blocks of a query. We group the join patterns that share a join variable into building blocks of the query plan, since executing them first provides opportunities to reduce the size of the intermediate results. In the second phase, we further optimize the initial pipelining for a given query plan. We employ optimization techniques such as sideways information passing and semi-joins to further reduce the size of intermediate results, improve query processing cost estimation, and speed up query execution. Experimental results on several RDF datasets of over a billion triples demonstrate that our approach outperforms existing RDF query engines that rely on static query processing strategies based on dynamic programming.
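
The two ideas named in the abstract, grouping triple patterns that share a join variable and pruning bindings with a semi-join, can be illustrated with a small Python sketch. This is not the authors' implementation: the function names `group_by_join_variable` and `semi_join`, the tuple representation of triple patterns, and the sample query and bindings are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Assumption: a triple pattern is a (subject, predicate, object) tuple of
# strings, and any term starting with "?" is a variable.
def variables(pattern):
    return {term for term in pattern if term.startswith("?")}

def group_by_join_variable(patterns):
    """Map each variable shared by two or more patterns to those patterns."""
    by_var = defaultdict(list)
    for pattern in patterns:
        for var in sorted(variables(pattern)):
            by_var[var].append(pattern)
    # Only variables that occur in at least two patterns can drive a join.
    return {var: ps for var, ps in by_var.items() if len(ps) >= 2}

def semi_join(left, right, var):
    """Keep the bindings in `left` whose value for `var` also occurs in `right`."""
    seen = {row[var] for row in right}
    return [row for row in left if row[var] in seen]

# Hypothetical query with three triple patterns.
query = [
    ("?person", "worksFor", "?org"),
    ("?person", "name", "?name"),
    ("?org", "locatedIn", "Wuhan"),
]
blocks = group_by_join_variable(query)

# Hypothetical intermediate bindings for the first and third patterns.
people = [
    {"?person": "alice", "?org": "hust"},
    {"?person": "bob", "?org": "acme"},
]
orgs = [{"?org": "hust"}]
reduced = semi_join(people, orgs, "?org")
```

Here `blocks` holds one candidate block per shared variable (`?person` and `?org`), and `reduced` keeps only the binding whose `?org` value survives the filter. A real engine would also order the blocks by estimated cost, which this sketch leaves out.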

Bibliographic details

Source: Knowledge and Information Systems, 2014, Issue 2, 24 pages
Authors: Pingpeng Yuan; Changfeng Xie; Hai Jin; Ling Liu; Guang Yang; Xuanhua Shi
Original format: PDF
Language: English
Chinese Library Classification: Automation systems theory
Keywords: Query processing; Plan generation; Query plan graph; Operator

Similar literature

1. Dynamic and fast processing of queries on large-scale RDF data [J]. Pingpeng Yuan, Changfeng Xie, Hai Jin. Knowledge and information systems, 2014, Issue 2.
2. Adaptive mechanism for distributed query processing and data loading using the RDF data in the cloud [J]. Dharmaraj Chandrasekaran Ranichandra, Tripathy BalaKrushna. International journal of communication systems, 2018, Issue 15.
3. RIQ: Fast processing of SPARQL queries on RDF quadruples [J]. Katib Anas, Slavov Vasil, Rao Praveen. Journal of web semantics, 2016, Issue MARa.
4. Fast Processing SPARQL Queries on Large RDF Data [C]. Guang Yang, Pingpeng Yuan, Hai Jin. 2016 IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress, 2016.
5. A new approach for fast processing of SPARQL queries on RDF quadruples [D]. Slavov, Vasil Georgiev. 2015.
6. Processing SPARQL queries with regular expressions in RDF databases [O]. Jinsoo Lee, Minh-Duc Pham, Jihwan Lee. 2011.
7. Towards Load Balancing and Parallelizing of RDF Query Processing in P2P Based Distributed RDF Data Stores [O]. Liaquat Ali, Thomas Janson, Christian Schindelhauer. 2015.
