BMC Bioinformatics

Automating Genomic Data Mining via a Sequence-based Matrix Format and Associative Rule Set

Abstract

There is an enormous amount of information encoded in each genome – enough to create living, responsive and adaptive organisms. Raw sequence data alone is not enough to understand function, mechanisms or interactions. Changes in a single base pair can lead to disease, such as sickle-cell anemia, while some large megabase deletions have no apparent phenotypic effect. Genomic features vary in their data types, and annotation of these features is spread across multiple databases. Herein, we develop a method to automate the exploration of genomes by iteratively searching sequence data for correlations and building upon them. First, to integrate and compare different annotation sources, a sequence matrix (SM) is developed to contain position-dependent information. Second, a classification tree is developed for matrix row types, specifying how each data type is to be treated with respect to other data types for analysis purposes. Third, correlative analyses are developed to analyze features of each matrix row in terms of the other rows, guided by the classification tree as to which analyses are appropriate. A prototype was developed and successfully detected coinciding genomic features among genes, exons, repetitive elements and CpG islands.
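
The three steps named in the abstract can be pictured with a small Python sketch. It is not the authors' prototype. The function names, the two row types (`"interval"` and `"score"`), the half-open 0-based coordinates, and the choice of Jaccard overlap and mean-score-inside-interval as the analyses are all assumptions made for illustration. The `ANALYSES` lookup table is only a flat stand-in for the classification tree.

```python
from typing import Callable, Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), 0-based (assumed convention)
Row = List[float]


def interval_row(length: int, intervals: List[Interval]) -> Row:
    """Build one matrix row: 1.0 at every position covered by an interval."""
    row = [0.0] * length
    for start, end in intervals:
        for pos in range(max(start, 0), min(end, length)):
            row[pos] = 1.0
    return row


def jaccard(a: Row, b: Row) -> float:
    """Fraction of covered positions that are covered by both rows."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def mean_inside(mask: Row, scores: Row) -> float:
    """Mean of a numeric row over the positions covered by an interval row."""
    inside = [s for m, s in zip(mask, scores) if m]
    return sum(inside) / len(inside) if inside else 0.0


# Hypothetical stand-in for the classification tree: which analysis is
# appropriate for a given pair of row types. Pairs not listed are skipped.
ANALYSES: Dict[Tuple[str, str], Callable[[Row, Row], float]] = {
    ("interval", "interval"): jaccard,
    ("interval", "score"): mean_inside,
}


def correlate(matrix: Dict[str, Row],
              row_types: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Analyze each row in terms of the other rows, as the type table allows."""
    results = {}
    for a in matrix:
        for b in matrix:
            if a == b:
                continue
            analysis = ANALYSES.get((row_types[a], row_types[b]))
            if analysis is None:
                continue
            if row_types[a] == row_types[b] and a > b:
                continue  # symmetric measure: report each unordered pair once
            results[(a, b)] = analysis(matrix[a], matrix[b])
    return results
```

A call such as `correlate({"gene": interval_row(10, [(0, 8)]), "exon": interval_row(10, [(0, 3), (5, 8)])}, {"gene": "interval", "exon": "interval"})` would report how strongly the two annotation rows coincide. A real sequence matrix over a genome would need a sparse or interval-based representation rather than one list element per base.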