Provenance Context Entity (PaCE): Scalable Provenance Tracking for Scientific RDF Data

Abstract

The Resource Description Framework (RDF) format is being used by a large number of scientific applications to store and disseminate their datasets. The provenance information, describing the source or lineage of the datasets, is playing an increasingly significant role in ensuring data quality, computing trust value of the datasets, and ranking query results. Current provenance tracking approaches using the RDF reification vocabulary suffer from a number of known issues, including lack of formal semantics, use of blank nodes, and application-dependent interpretation of reified RDF triples. In this paper, we introduce a new approach called Provenance Context Entity (PaCE) that uses the notion of provenance context to create provenance-aware RDF triples. We also define the formal semantics of PaCE through a simple extension of the existing RDF(S) semantics that ensures compatibility of PaCE with existing Semantic Web tools and implementations. We have implemented the PaCE approach in the Biomedical Knowledge Repository (BKR) project at the US National Library of Medicine. The evaluations demonstrate a minimum of 49% reduction in total number of provenance-specific RDF triples generated using the PaCE approach as compared to RDF reification. In addition, performance for complex queries improves by three orders of magnitude and remains comparable to the RDF reification approach for simpler provenance queries.
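
The abstract does not show what a provenance-aware triple looks like next to a reified one. The following is a minimal sketch, assuming that the provenance context is embedded in the entity URIs and that each contextualized entity is linked back to its source. The `EX` namespace, the `derives_from` property name, the URI suffix scheme, and the helper names `reify` and `contextualize` are hypothetical and chosen for illustration only; the triple counts of this toy example are not the paper's evaluation figures.

```python
# Illustrative sketch only: the namespace, property name, and URI scheme
# below are assumptions made for this example, not taken from the paper.

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"          # hypothetical namespace
DERIVES_FROM = EX + "derives_from"  # hypothetical provenance property


def reify(triple, source, stmt_id):
    """Attach a source to a triple using the RDF reification vocabulary.

    Returns the four standard reification triples plus one triple that
    links the statement node to its source.
    """
    s, p, o = triple
    stmt = f"_:stmt{stmt_id}"  # blank node standing for the statement
    return [
        (stmt, RDF + "type", RDF + "Statement"),
        (stmt, RDF + "subject", s),
        (stmt, RDF + "predicate", p),
        (stmt, RDF + "object", o),
        (stmt, DERIVES_FROM, source),
    ]


def contextualize(triple, source, context_id):
    """Attach a source by minting context-specific URIs (assumed scheme).

    The subject and object URIs get a suffix identifying the provenance
    context, and each contextualized entity is linked to the source.
    No blank nodes are introduced.
    """
    s, p, o = triple
    s_ctx = f"{s}_{context_id}"
    o_ctx = f"{o}_{context_id}"
    return [
        (s_ctx, p, o_ctx),
        (s_ctx, DERIVES_FROM, source),
        (o_ctx, DERIVES_FROM, source),
    ]


# Hypothetical example data.
example_triple = (EX + "entity_a", EX + "affects", EX + "entity_b")
example_source = EX + "source_doc_1"

reified = reify(example_triple, example_source, 1)
contextual = contextualize(example_triple, example_source, "source_doc_1")

print(len(reified), "triples with reification")
print(len(contextual), "triples with context-specific URIs")
```

In this sketch the reified form needs a blank node and five triples in addition to the original statement, while the contextualized form carries the source in the URIs themselves and can be queried with ordinary triple patterns.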

Bibliographic details

Source
Scientific and Statistical Database Management | 2010 | pp. 461-470 | 10 pages
Conference location: Heidelberg (DE)
Authors
Satya S. Sahoo; Olivier Bodenreider; Pascal Hitzler; Amit Sheth; Krishnaprasad Thirunarayan
Author affiliations

Kno.e.sis Center, Computer Science and Engineering Department, Wright State University, Dayton, OH, USA;

Lister Hill National Center for Biomedical Communications, National Library of Medicine, NIH, Bethesda, MD, USA;

Kno.e.sis Center, Computer Science and Engineering Department, Wright State University, Dayton, OH, USA;

Kno.e.sis Center, Computer Science and Engineering Department, Wright State University, Dayton, OH, USA;

Kno.e.sis Center, Computer Science and Engineering Department, Wright State University, Dayton, OH, USA;

Original format: PDF
Language: English
Chinese Library Classification: TP311.13
Keywords
provenance context entity; biomedical knowledge repository; context theory; RDF reification; provenir ontology;

Similar literature

1. RDFProv: A relational RDF store for querying and managing scientific workflow provenance [J]. Artem Chebotko, Shiyong Lu, Xubo Fei. Data & Knowledge Engineering. 2010, No. 8
2. Relationalization of provenance data in complex RDF reification nodes [J]. Sunitha Ramanujam, Anubha Gupta, Latifur Khan. Electronic Commerce Research. 2010, No. 3-4
3. The contribution and reuse of LTER data in the Provenance Aware Synthesis Tracking Architecture (PASTA) data repository [J]. Mark Servilla, James Brunt, Duane Costa. Ecological Informatics: An International Journal on Ecoinformatics and Computational Ecology. 2016
4. Provenance Context Entity (PaCE): Scalable Provenance Tracking for Scientific RDF Data [C]. Satya S. Sahoo, Olivier Bodenreider, Pascal Hitzler. International Conference on Scientific and Statistical Database Management. 2010
5. Enabling Reproducibility of Scientific Data Flows Through Tracking and Representation of Provenance [D]. Curt Tilmes. 2011
6. Provenance Context Entity (PaCE): Scalable Provenance Tracking for Scientific RDF Data [O]. Satya S. Sahoo, Olivier Bodenreider, Pascal Hitzler
7. Provenance Context Entity (PaCE): Scalable Provenance Tracking for Scientific RDF Data [O]. Satya Sahoo, Olivier Bodenreider, Pascal Hitzler. 2014
