Journal: Semantic Web

Scalable long-term preservation of relational data through SPARQL queries

Abstract

We present an approach for scalable long-term preservation of data stored in relational databases (RDBs) as RDF, implemented in the SAQ (Semantic Archive and Query) system. The proposed approach is suitable for archiving scientific data used in scientific publications where it is desirable to preserve only parts of an RDB, e.g., only data about a specific set of experimental artefacts in the database. With this approach, long-term preservation as RDF of selected parts of a database is specified as an archival query in an extended SPARQL dialect, A-SPARQL. Query processing is based on automatically generating an RDF view of the relational database to be archived, called the RD-view. A-SPARQL provides flexible selection of the data to be archived in terms of a SPARQL-like query to the RD-view. The result of an archival query is a data archive file containing the RDF triples that represent the relational data content to be preserved. The system also generates a schema archive file in which sufficient metadata are saved to allow the archived database to be fully reconstructed. An archival query usually selects both properties and their values for sets of subjects, which makes the property p in some triple patterns unknown. We call such queries, where properties are unknown, unbound-property queries. To achieve scalable data preservation and recreation, we propose query transformation strategies suitable for optimizing unbound-property queries. These query rewriting strategies were implemented and evaluated in a new benchmark for archival queries called ABench. ABench is defined as a set of typical A-SPARQL queries archiving selected parts of databases generated by the Berlin benchmark data generator. In experiments, the SAQ optimization strategies were evaluated by measuring the performance of A-SPARQL queries selecting triples for archival in ABench. The performance of equivalent SPARQL queries for related systems was also measured. The results showed that the proposed optimizations substantially improve the query execution time for archival queries.
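
To make the idea of an unbound-property selection over an RDF view more concrete, the following is a minimal Python sketch. It is not the SAQ implementation and does not use A-SPARQL syntax, which the abstract does not show. The namespace `BASE`, the functions `rd_view` and `archive`, the `product` table, and the row-to-triple mapping (one subject per row, one property per non-key column) are all assumptions made for illustration. The sketch simply materializes every triple and filters by subject; it does not attempt the query transformations the abstract mentions.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace, not taken from the paper


def rd_view(conn, table, key):
    """Yield (subject, property, value) triples for each non-key column of `table`.

    Assumed mapping: one subject URI per row, one property URI per column.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"
        for column, value in record.items():
            if column != key and value is not None:
                yield (subject, f"{BASE}{table}#{column}", value)


def archive(triples, subjects):
    """Unbound-property selection: keep every (property, value) of the chosen subjects."""
    return [t for t in triples if t[0] in subjects]


# Small in-memory database standing in for the relational database to archive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, label TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "probe", 9.5), (2, "sensor", None), (3, "lens", 20.0)],
)

# Archive only products 1 and 2; the property position is left unbound.
wanted = {f"{BASE}product/1", f"{BASE}product/2"}
selected = archive(rd_view(conn, "product", "id"), wanted)
for triple in selected:
    print(triple)
```

In SPARQL terms, the filter in `archive` corresponds to a triple pattern of the form `?s ?p ?o` restricted to a set of subjects, which is the kind of pattern the abstract calls an unbound-property query.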