Conference paper: International Conference on Big Data Computing and Communications

Query Optimization for Massive RDF Data Based on Spark

Abstract

SPARQL (SPARQL Protocol and RDF Query Language) is a query language and data access protocol for RDF. Although it is defined for the RDF data model developed by the W3C, it can be used with any data resource that can be represented as RDF. With the explosive growth of web information resources, more and more data is stored in RDF. Retrieving useful information from massive data has become a major challenge, and efficient search and query processing have become a focus of research. In this paper, we design an efficient optimization method that finds semantic connection chains in our system (SparkIlink). Data is stored on the Hadoop Distributed File System (HDFS). Built on the Spark framework and its efficient distributed in-memory processing, the system achieves efficient search and query optimization for massive RDF data. Our work includes the following mechanisms: (1) vertical partitioning as the data storage structure; (2) collecting data statistics twice; (3) a semantics-based information connection chain. The system supports queries over massive numbers of triples in a distributed environment and processes them efficiently. Our experiments compare against SPARQLGX, the latest RDF system on the Spark platform. Compared with SPARQLGX, our system is more efficient in data search.
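
To make mechanisms (1) and (2) concrete, the following is a minimal sketch of vertical partitioning and statistics-based pattern ordering. It is plain Python rather than Spark, and it is not the paper's implementation: the function names (`vertical_partition`, `predicate_counts`, `order_patterns`), the sample triples, and the smallest-table-first heuristic are all assumptions made for illustration. The abstract does not say which statistics are collected or how the semantic connection chain is built, so neither is modeled here.

```python
from collections import defaultdict


def vertical_partition(triples):
    """Group (subject, predicate, object) triples into one
    (subject, object) table per predicate."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return dict(tables)


def predicate_counts(tables):
    """Per-predicate row counts; a simple stand-in for data statistics."""
    return {p: len(rows) for p, rows in tables.items()}


def order_patterns(patterns, counts):
    """Order triple patterns so the smallest predicate table comes first
    (hypothetical heuristic, not taken from the paper)."""
    return sorted(patterns, key=lambda pat: counts.get(pat[1], 0))


# Hypothetical sample data.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "alice"),
    ("alice", "worksAt", "acme"),
]

tables = vertical_partition(triples)
counts = predicate_counts(tables)

# Two triple patterns of a query; variables start with "?".
patterns = [("?x", "knows", "?y"), ("?x", "worksAt", "?z")]
plan = order_patterns(patterns, counts)
```

With this layout, a triple pattern with a bound predicate only touches that predicate's table, and the counts give a cheap way to start a join from the most selective pattern.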
