Conference: OnTheMove Confederated International Conferences

An Extensible Ontology Modeling Approach Using Post Coordinated Expressions for Semantic Provenance in Biomedical Research

Abstract

Provenance metadata, which describes the source or origin of data, is critical for verifying and validating the results of scientific experiments. Indeed, the reproducibility of scientific studies is rapidly gaining attention in the research community, for example in biomedical and healthcare research. To address this challenge in the biomedical research domain, we have developed Provenance for Clinical and Healthcare Research (ProvCaRe) using the World Wide Web Consortium (W3C) PROV specifications, including the PROV Ontology (PROV-O). In the ProvCaRe project, we are extending PROV-O to create a formal model of the provenance information that is necessary for scientific reproducibility and replication in biomedical research. However, the development of the ProvCaRe ontology faces several challenges: (1) Ontology engineering: modeling all biomedical provenance-related terms in an ontology has an undefined scope and is not feasible before the release of the ontology; (2) Redundancy: a large number of existing biomedical ontologies already model relevant biomedical terms; and (3) Ontology maintenance: adding or deleting terms from a large ontology is error-prone, which makes the ontology difficult to maintain over time. Therefore, in contrast to modeling all classes and properties in an ontology before deployment (also called precoordination), we propose the "ProvCaRe Compositional Grammar Syntax" to model ontology classes on demand (also called postcoordination). The compositional grammar syntax allows us to reuse existing biomedical ontology classes and compose provenance-specific terms that extend PROV-O classes and properties. We demonstrate the application of this approach in the ProvCaRe ontology and the use of the ontology in the development of the ProvCaRe knowledgebase, which consists of more than 38 million provenance triples automatically extracted from 384,802 published research articles using a text processing workflow.
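The abstract does not reproduce the ProvCaRe Compositional Grammar Syntax itself, so the following is only a minimal sketch of the general idea of postcoordination, not the authors' grammar. It assumes a made-up textual form (`parent + focus : property = value`), a hypothetical `ex:` prefix for reused biomedical classes, and a hypothetical `PostcoordinatedTerm` helper. Only the PROV-O names (`prov:Entity`, `prov:Activity`, `prov:Agent`, `prov:used`) come from the W3C specification.

```python
from dataclasses import dataclass

# Core PROV-O classes that a composed term may extend.
PROV_CLASSES = {"prov:Entity", "prov:Activity", "prov:Agent"}


@dataclass(frozen=True)
class PostcoordinatedTerm:
    """A term composed on demand instead of being predefined in the ontology.

    Hypothetical structure for illustration only; it is not the
    ProvCaRe Compositional Grammar Syntax.
    """
    parent: str               # PROV-O class being extended
    focus: str                # class reused from an existing ontology (assumed CURIE)
    refinements: tuple = ()   # (property, value) pairs that qualify the term

    def __post_init__(self):
        if self.parent not in PROV_CLASSES:
            raise ValueError(f"{self.parent} is not a core PROV-O class")

    def render(self) -> str:
        """Serialize the composed term in the assumed textual form."""
        expr = f"{self.parent} + {self.focus}"
        if self.refinements:
            parts = ", ".join(f"{p} = {v}" for p, v in self.refinements)
            expr += f" : {parts}"
        return expr

    def triples(self, subject: str):
        """Yield (subject, predicate, object) triples for one instance."""
        yield (subject, "rdf:type", self.parent)
        yield (subject, "rdf:type", self.focus)
        for prop, value in self.refinements:
            yield (subject, prop, value)


# Example: a sleep study activity that used a polysomnography recording.
# Both ex: identifiers are placeholders, not real ontology terms.
term = PostcoordinatedTerm(
    parent="prov:Activity",
    focus="ex:SleepStudy",
    refinements=(("prov:used", "ex:PolysomnographyRecording"),),
)
print(term.render())
for triple in term.triples("ex:study1"):
    print(triple)
```

Under these assumptions, no new class has to be added to the ontology for each combination. The composed term reuses an existing class and a PROV-O property, which is the maintenance benefit the abstract attributes to postcoordination.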
