Journal: In silico biology: An international on computational biology

Linking Experimental Results, Biological Networks and Sequence Analysis Methods Using Ontologies and Generalised Data Structures


Abstract

We describe the structure of a closely integrated data warehouse designed to link different types and varying numbers of biological networks, sequence analysis methods and experimental results such as those coming from microarrays. The data schema combines graph-based methods with generalised data structures and makes use of ontologies and meta-data. The core idea is to treat and store biological networks as graphs, and to use generalised data structures (GDS) to store further relevant information. This is possible because many biological networks can be stored as graphs: protein interactions, signal transduction networks, metabolic pathways, gene regulatory networks etc. Nodes in biological graphs represent entities such as promoters, proteins, genes and transcripts, whereas the edges specify how the nodes are related. The semantics of the nodes and edges are defined using ontologies of node and relation types. Besides the generic attributes that most biological entities possess (name, attribute description), further information is stored using generalised data structures. By linking directly and systematically to the underlying sequences (exons, introns, promoters, amino acid sequences), close interoperability with sequence analysis methods can be achieved. This approach allows us to store, query and update a wide variety of biological information in a semantically compact way, without requiring changes at the database schema level when new kinds of biological information are added. We describe how this data warehouse is being implemented by extending the text-mining framework ONDEX to link, support and complement different bioinformatics applications and research activities such as microarray analysis, sequence analysis and modelling/simulation of biological systems. The system is developed under the GPL license and can be downloaded from http://sourceforge.net/projects/ondex/
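
The data model outlined in the abstract can be illustrated with a minimal Python sketch. It is not ONDEX code and does not reflect the ONDEX API: the class names (`Node`, `Edge`, `BioGraph`), the small sets of node and relation types standing in for the ontologies, and the example identifiers and values are all assumptions made for illustration. The sketch only shows the general idea of typed nodes and edges plus an open-ended attribute store (here a plain dictionary playing the role of a generalised data structure).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Hypothetical controlled vocabularies standing in for the ontologies of
# node types and relation types; a real system would load these from an ontology.
NODE_TYPES = {"gene", "transcript", "protein", "promoter"}
RELATION_TYPES = {"encodes", "interacts_with", "regulates"}


@dataclass
class Node:
    node_id: str
    node_type: str          # must be a term from NODE_TYPES
    name: str = ""          # generic attribute shared by most entities
    description: str = ""   # generic attribute shared by most entities
    # Open-ended attribute store playing the role of a generalised data structure.
    gds: Dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation_type: str      # must be a term from RELATION_TYPES


class BioGraph:
    """A directed graph whose nodes and edges are typed by controlled vocabularies."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.edges: List[Edge] = []

    def add_node(self, node: Node) -> Node:
        if node.node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node.node_type}")
        self.nodes[node.node_id] = node
        return node

    def add_edge(self, source: str, target: str, relation_type: str) -> Edge:
        if relation_type not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {relation_type}")
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        edge = Edge(source, target, relation_type)
        self.edges.append(edge)
        return edge

    def neighbours(self, node_id: str, relation_type: Optional[str] = None) -> List[Node]:
        """Return targets of outgoing edges, optionally filtered by relation type."""
        return [
            self.nodes[e.target]
            for e in self.edges
            if e.source == node_id
            and (relation_type is None or e.relation_type == relation_type)
        ]


# Illustrative usage with made-up identifiers and values.
graph = BioGraph()
graph.add_node(Node("g1", "gene", name="geneA", gds={"sequence": "ATGGCC"}))
graph.add_node(Node("p1", "protein", name="proteinA"))
graph.add_edge("g1", "p1", "encodes")

# A new kind of information (here a made-up expression value) is attached
# through the attribute store, without changing the Node definition.
graph.nodes["g1"].gds["expression_log2"] = 1.8
```

In this sketch the sequence is held as an ordinary attribute of the gene node, which is one simple way to picture the direct link to underlying sequences; how the actual system stores and links sequences is not specified in the abstract.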
