Heterogeneous biological data integration with declarative query language

Nguyen H.; Michel L.; Thompson J.D.; Poch O.


Abstract

The need for scalable data integration systems in modern biology is indisputable, given the very large, heterogeneous, and complex datasets available in public databases. The management and fusion of this “big data” with local databases represent a major challenge, since they underlie the computational inferences and models that will subsequently be generated and validated experimentally. In this paper, we present an alternative approach to local data integration, called BIRD (Biological Integration and Retrieval Data), based on four concepts: (i) a hybrid flat file and relational database architecture permits the rapid management of large volumes of heterogeneous datasets; (ii) a generic data model allows the simultaneous organization and classification of local databases according to real-world requirements; (iii) configuration rules are used to divide and map each data resource into several data model entities; and (iv) a simple, declarative query language (BIRD-QL) facilitates information extraction from heterogeneous datasets. This flexible, generic design allows the integration of diverse data formats in a searchable database with high-level functionalities depending on the specific scientific context. It has been validated in the context of real-world projects, notably the SM2PH (Structural Mutation to the Phenotypes of Human Pathologies) project.
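
Concept (iii) can be illustrated with a minimal sketch. The abstract does not give BIRD's configuration syntax, so everything here is an assumption made for illustration: the two-letter line codes, the entity and field names, the `RULES` table, the `map_record` helper, and the sample record are all hypothetical. The sketch only shows the general idea of rules that divide one flat-file record and map its parts onto several data model entities.

```python
# Hypothetical configuration rules: each rule maps a line prefix in a
# flat-file record to an (entity, field) pair in a generic data model.
# Illustrative only; this is not BIRD's actual configuration syntax.
RULES = {
    "ID": ("protein", "identifier"),
    "DE": ("protein", "description"),
    "OS": ("organism", "name"),
    "SQ": ("sequence", "residues"),
}


def map_record(record_text, rules):
    """Split one flat-file record into data model entities using the rules."""
    entities = {}
    for line in record_text.splitlines():
        # The first whitespace-separated token is treated as the line code.
        prefix, _, value = line.partition(" ")
        if prefix not in rules:
            continue  # lines with no matching rule are ignored
        entity, field = rules[prefix]
        entities.setdefault(entity, {})[field] = value.strip()
    return entities


# A made-up record in a made-up format, used only to exercise the rules.
record = """ID X00001
DE Example protein
OS Homo sapiens
SQ MKTAYIAK
XX unused line"""

entities = map_record(record, RULES)
```

Keeping the rules as data, separate from the parsing function, means that under this sketch a new data resource would be supported by writing a new rule table, with no change to the code.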


Bibliographic details

Source: IBM Journal of Research and Development, 2014, Issue 2, pp. 1-12 (12 pages)
Authors: Nguyen H.; Michel L.; Thompson J.D.; Poch O.
Original format: PDF
Language: English

