Journal: IBM Journal of Research and Development

Heterogeneous biological data integration with declarative query language

Abstract

The need for scalable data integration systems in modern biology is indisputable, due to the very large, heterogeneous, and complex datasets available in public databases. The management and fusion of this “big data” with local databases represents a major challenge, since it underlies the computational inferences and models that will be subsequently generated and validated experimentally. In this paper, we present an alternative conception for local data integration, called BIRD (Biological Integration and Retrieval Data), based on four concepts: (i) a hybrid flat file and relational database architecture permits the rapid management of large volumes of heterogeneous datasets; (ii) a generic data model allows the simultaneous organization and classification of local databases according to real-world requirements; (iii) configuration rules are used to divide and map each data resource into several data model entities; and (iv) a simple, declarative query language (BIRD-QL) facilitates information extraction from heterogeneous datasets. This flexible, generic design allows the integration of diverse data formats in a searchable database with high-level functionalities depending on the specific scientific context. It has been validated in the context of real-world projects, notably the SM2PH (Structural Mutation to the Phenotypes of Human Pathologies) project.
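
To make concepts (i) and (iii) more concrete, the following is a minimal sketch of a hybrid flat-file and relational layout. It is not BIRD's implementation and does not use BIRD-QL syntax, neither of which is given in the abstract. The record format ("//" terminators, two-letter line codes), the RULES mapping, the table layout, and the function names are all assumptions made for illustration. The idea shown is that records stay in the flat file, while a small relational index stores the mapped fields plus byte offsets.

```python
import os
import sqlite3
import tempfile

# Hypothetical flat-file resource: each record ends with "//",
# and each line is "<code>   <value>" (assumed format, for illustration only).
SAMPLE = (
    "ID   P1\nOS   Homo sapiens\nSQ   MKTAYIAK\n//\n"
    "ID   P2\nOS   Mus musculus\nSQ   MSLLTEVE\n//\n"
    "ID   P3\nOS   Homo sapiens\nSQ   MGDVEKGK\n//\n"
)

# Hypothetical configuration rules: line code -> indexed column of the entity.
RULES = {"ID": "accession", "OS": "organism"}


def build_index(path, conn):
    """Scan the flat file once; store mapped fields and byte ranges in SQLite."""
    conn.execute(
        "CREATE TABLE entry ("
        "accession TEXT, organism TEXT, offset INTEGER, length INTEGER)"
    )
    with open(path, "rb") as fh:
        start, fields = 0, {}
        while True:
            line = fh.readline()
            if not line:
                break
            text = line.decode("ascii").rstrip("\n")
            if text == "//":
                # End of record: remember where it lives in the flat file.
                end = fh.tell()
                conn.execute(
                    "INSERT INTO entry VALUES (?, ?, ?, ?)",
                    (fields.get("accession"), fields.get("organism"),
                     start, end - start),
                )
                start, fields = end, {}
            else:
                code, _, value = text.partition("   ")
                if code in RULES:
                    fields[RULES[code]] = value
    conn.commit()


def fetch(path, conn, organism):
    """Select records declaratively in SQL, then read them from the flat file."""
    rows = conn.execute(
        "SELECT offset, length FROM entry WHERE organism = ? ORDER BY offset",
        (organism,),
    ).fetchall()
    records = []
    with open(path, "rb") as fh:
        for offset, length in rows:
            fh.seek(offset)
            records.append(fh.read(length).decode("ascii"))
    return records


def demo():
    """Write the sample file, index it, and return the matching raw records."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "proteins.dat")
        with open(path, "w", encoding="ascii", newline="\n") as fh:
            fh.write(SAMPLE)
        conn = sqlite3.connect(":memory:")
        try:
            build_index(path, conn)
            return fetch(path, conn, "Homo sapiens")
        finally:
            conn.close()


if __name__ == "__main__":
    for record in demo():
        print(record)
```

In this sketch the relational side holds only the fields named in the rules, so the bulky record bodies never enter the database; a query resolves to byte ranges that are then read directly from the file.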
