Conference: Extended Semantic Web Conference

Using Shape Expressions (ShEx) to Share RDF Data Models and to Guide Curation with Rigorous Validation



Abstract

We discuss Shape Expressions (ShEx), a concise, formal modeling and validation language for RDF structures. For instance, a Shape Expression could prescribe that subjects in a given RDF graph that fall into the shape "Paper" are expected to have a section called "Abstract", and any ShEx implementation can confirm whether that is indeed the case for all such subjects within a given graph or subgraph. There are currently five actively maintained ShEx implementations. We discuss how we use the JavaScript, Scala and Python implementations in RDF data validation workflows in distinct, applied contexts. We present examples of how ShEx can be used to model and validate data from two different sources: the domain-specific Fast Healthcare Interoperability Resources (FHIR) and the domain-generic Wikidata knowledge base, which is the linked database built and maintained by the Wikimedia Foundation as a sister project to Wikipedia. Example projects that are using Wikidata as a data curation platform are presented as well, along with ways in which they are using ShEx for modeling and validation. When reusing RDF graphs created by others, it is important to know how the data is represented. Current practices of using human-readable descriptions or ontologies to communicate data structures often lack sufficient precision for data consumers to quickly and easily understand data representation details. We provide concrete examples of how we use ShEx as a constraint and validation language that allows humans and machines to communicate unambiguously about data assets. We use ShEx to exchange and understand data models of different origins, and to express a shared model of a resource's footprint in a Linked Data source. We also use ShEx to develop data models in an agile way: we test them against sample data, then revise or refine them. The expressivity of ShEx allows us to catch disagreements, inconsistencies, or errors efficiently, both at the time of input and through batch inspections.
ShEx addresses the need of the Semantic Web community to ensure data quality for RDF graphs. It is currently being used in the development of FHIR/RDF. The language is sufficiently expressive to capture constraints in FHIR, and the intuitive syntax helps people to quickly grasp the range of conformant documents. The publication workflow for FHIR tests all of these examples against the ShEx schemas, catching non-conformant data before they reach the public. ShEx is also currently used in Wikidata projects such as Gene Wiki and WikiCite to develop quality-control pipelines that maintain data integrity and incorporate or harmonize differences in data across different parts of the pipelines.
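
The "Paper" and "Abstract" example in the abstract can be made concrete with a small sketch. The code below is not ShEx and does not use any ShEx implementation; it is a plain-Python approximation of the kind of check a shape validator performs, written under several assumptions. The names `ex:Paper` and `ex:abstract`, the helper `nonconforming_papers`, and the choice to select candidate nodes by `rdf:type` are all hypothetical and do not come from the paper.

```python
from typing import Iterable, List, NamedTuple


class Triple(NamedTuple):
    """A single RDF statement, with terms kept as plain strings for brevity."""
    subject: str
    predicate: str
    obj: str


RDF_TYPE = "rdf:type"
PAPER = "ex:Paper"        # hypothetical class, used only for this illustration
ABSTRACT = "ex:abstract"  # hypothetical predicate, used only for this illustration


def nonconforming_papers(graph: Iterable[Triple]) -> List[str]:
    """Return the subjects typed ex:Paper that have no ex:abstract value."""
    triples = list(graph)
    # Assumption: nodes to check are those explicitly typed as ex:Paper.
    papers = {t.subject for t in triples
              if t.predicate == RDF_TYPE and t.obj == PAPER}
    failures = []
    for paper in sorted(papers):
        has_abstract = any(t.subject == paper and t.predicate == ABSTRACT
                           for t in triples)
        if not has_abstract:
            failures.append(paper)
    return failures


graph = [
    Triple("ex:paper1", RDF_TYPE, PAPER),
    Triple("ex:paper1", ABSTRACT, "We discuss Shape Expressions (ShEx) ..."),
    Triple("ex:paper2", RDF_TYPE, PAPER),  # no abstract, so it should be flagged
]

print(nonconforming_papers(graph))
```

A real ShEx schema would state the same expectation declaratively, and a ShEx implementation would additionally report why each node fails, which this sketch leaves out.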
