Journal: BMC Bioinformatics

KaBOB: ontology-based semantic integration of biomedical databases


Abstract

Background: The ability to query many independent biological databases using a common ontology-based semantic model would facilitate deeper integration and more effective utilization of these diverse and rapidly growing resources. Despite ongoing work moving toward shared data formats and linked identifiers, significant problems persist in the semantic data integration needed to establish shared identity and shared meaning across heterogeneous biomedical data sources.

Results: We present five processes for semantic data integration that, when applied collectively, solve seven key problems. These processes include making explicit the differences between biomedical concepts and database records, aggregating sets of identifiers denoting the same biomedical concepts across data sources, and using declaratively represented forward-chaining rules to take information that is variably represented in source databases and integrate it into a consistent biomedical representation. We demonstrate these processes and solutions by presenting KaBOB (the Knowledge Base Of Biomedicine), a knowledge base of semantically integrated data from 18 prominent biomedical databases using common representations grounded in Open Biomedical Ontologies. An instance of KaBOB with data about humans and seven major model organisms can be built using on the order of 500 million RDF triples. All source code for building KaBOB is available under an open-source license.

Conclusions: KaBOB is an integrated knowledge base of biomedical data representationally based in prominent, actively maintained Open Biomedical Ontologies, thus enabling queries of the underlying data in terms of biomedical concepts (e.g., genes and gene products, interactions and processes) rather than features of source-specific data schemas or file formats. KaBOB resolves many of the issues that routinely plague biomedical researchers intending to work with data from multiple data sources and provides a platform for ongoing data integration and development and for formal reasoning over a wealth of integrated biomedical data.
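
One of the processes named in the abstract, aggregating sets of identifiers that denote the same biomedical concept across data sources, can be illustrated with a small union-find sketch. This is not KaBOB's actual implementation, which the abstract does not describe in detail. The function name `aggregate_identifiers` and the identifiers `DB_A:1`, `DB_B:x`, and so on are hypothetical placeholders, and the sketch assumes the cross-references between sources arrive as simple pairs of equivalent identifiers.

```python
from collections import defaultdict


def aggregate_identifiers(equivalences):
    """Group identifiers into sets that denote the same concept.

    `equivalences` is an iterable of (id_a, id_b) pairs, each asserting
    that two database identifiers refer to the same biomedical concept.
    Returns a sorted list of sorted identifier lists, one per concept.
    """
    parent = {}  # union-find forest: identifier -> parent identifier

    def find(x):
        # Register unseen identifiers as their own root.
        parent.setdefault(x, x)
        # Walk to the root, shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for id_a, id_b in equivalences:
        root_a, root_b = find(id_a), find(id_b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two identifier sets

    groups = defaultdict(set)
    for identifier in list(parent):
        groups[find(identifier)].add(identifier)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical cross-references between three placeholder databases.
example_pairs = [
    ("DB_A:1", "DB_B:x"),
    ("DB_B:x", "DB_C:7"),
    ("DB_A:2", "DB_C:9"),
]
clusters = aggregate_identifiers(example_pairs)
print(clusters)
```

Each resulting cluster stands for one concept, kept distinct from the individual database records whose identifiers it collects. Equivalence is transitive here: `DB_A:1` and `DB_C:7` end up in the same cluster even though no pair links them directly.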
