Journal: BMC Genomics
Genome-wide identification of coding and non-coding conserved sequence tags in human and mouse genomes


Abstract

Background: The accurate detection of genes and the identification of functional regions is still an open issue in the annotation of genomic sequences. This problem affects not only new genomes but also those of very well studied organisms such as human and mouse, where, despite great efforts, the inventory of genes and regulatory regions is far from complete. Comparative genomics is an effective approach to this problem. Unfortunately, it is limited by the computational requirements of genome-wide comparisons and by the difficulty of discriminating between conserved coding and non-coding sequences. This discrimination is often based on, and thus dependent on, the availability of annotated proteins.

Results: In this paper we present the results of a comprehensive comparison of the human and mouse genomes, performed with a new high-throughput grid-based system that allows the rapid detection of conserved sequences and an accurate assessment of their coding potential. By detecting clusters of coding conserved sequences, the system is also suited to accurately identifying potential gene loci. Following this analysis, we created a collection of human-mouse conserved sequence tags and carefully compared our results with reliable annotations in order to benchmark the reliability of our classifications. Strikingly, we were able to detect several potential gene loci that are supported by EST sequences but do not correspond to any gene annotated so far.

Conclusion: Here we present a new system that allows the comprehensive comparison of genomes to detect conserved coding and non-coding sequences and to identify potential gene loci. Our system does not require the availability of any annotated sequence and is thus suitable for the analysis of new or poorly annotated genomes.
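The abstract does not say how clusters of coding conserved sequence tags (CSTs) are turned into candidate gene loci. As a rough illustration only, and not the authors' method, the idea can be sketched as single-linkage grouping of coding CSTs along one chromosome. The tuple layout, the function name `cluster_coding_csts`, and the thresholds `max_gap` and `min_count` are all assumptions introduced for this example.

```python
from typing import List, Tuple

# Hypothetical representation of a CST on one chromosome:
# (start, end, is_coding). Not taken from the paper.
CST = Tuple[int, int, bool]


def cluster_coding_csts(
    csts: List[CST],
    max_gap: int = 10_000,  # assumed threshold, in bases
    min_count: int = 2,     # assumed minimum number of coding CSTs per locus
) -> List[Tuple[int, int, int]]:
    """Group coding CSTs separated by at most max_gap bases.

    Returns (locus_start, locus_end, n_coding_csts) for every group
    that contains at least min_count coding CSTs.
    """
    # Keep only coding CSTs and order them by start coordinate.
    coding = sorted((start, end) for start, end, is_coding in csts if is_coding)

    loci: List[Tuple[int, int, int]] = []
    current = None  # [start, end, count] of the group being built

    for start, end in coding:
        if current is not None and start - current[1] <= max_gap:
            # Close enough to the running group: extend it.
            current[1] = max(current[1], end)
            current[2] += 1
        else:
            # Gap too large: emit the finished group if it is big enough.
            if current is not None and current[2] >= min_count:
                loci.append((current[0], current[1], current[2]))
            current = [start, end, 1]

    if current is not None and current[2] >= min_count:
        loci.append((current[0], current[1], current[2]))
    return loci


if __name__ == "__main__":
    example = [
        (100, 200, True),
        (5_000, 5_100, True),
        (6_000, 6_100, False),   # non-coding CST, ignored by the clustering
        (9_000, 9_200, True),
        (50_000, 50_100, True),  # isolated coding CST, below min_count
    ]
    print(cluster_coding_csts(example))
```

With the example input, the three nearby coding CSTs merge into one candidate locus, while the isolated coding CST and the non-coding CST are left out. A real pipeline would also have to handle strand, multiple chromosomes, and the coding-potential scoring itself, none of which is specified in the abstract.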