The sequence comparison index: A novel method for comparing proteins and proteomes.

Abstract

Historically, morphological features were the primary means of classifying organisms. The age of molecular genetics, however, has made it possible to approach classification from the perspective of an organism's genetic code. Early work used highly conserved sequences, such as ribosomal RNA. The increasing number of complete genomes in public data repositories provides the opportunity to look not only at a single gene, but at an organism's entire parts list.

Here the Sequence Comparison Index (SCI) and the Organism Comparison Index (OCI), algorithms and methods for comparing proteins and proteomes, are presented. The complete proteomes of 104 sequenced organisms were compared. Over 280 million full Smith-Waterman alignments were performed on sequence pairs that had a reasonable expectation of being related. From these alignments a whole-proteome phylogenetic tree was constructed. The method was also used to compare the small subunit (SSU) rRNA from each organism, and a tree was constructed from these results. The SSU rRNA tree produced by the SCI/OCI method looks very much like accepted SSU rRNA trees from sources such as the Ribosomal Database Project, thus validating the method. The SCI/OCI proteome tree showed a number of small but significant differences when compared to the SSU rRNA tree and to proteome trees constructed by other methods. Horizontal gene transfer does not appear to affect the SCI/OCI trees until the transferred genes make up a large portion of the proteome.

As part of this work, the Database of Related Local Alignments (DaRLA) was created; it contains over 81 million rows of sequence alignment information. DaRLA, while primarily used to build the whole-proteome trees, can also be applied to shared gene content analysis, gene order analysis, and the creation of individual protein trees.

Finally, the standard BLAST method for analyzing shared gene content was compared to the SCI method using four spirochetes. The SCI system performed flawlessly, finding all proteins when an organism was compared against itself and finding all the ribosomal proteins between organisms. The BLAST system missed some proteins when an organism was compared against itself and failed to detect small ribosomal proteins between organisms.
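
The abstract names Smith-Waterman local alignment as the pairwise comparison step but does not give the scoring scheme or the SCI and OCI formulas. As a rough illustration of that alignment step only, the following minimal Python sketch computes a local alignment score. The function name `smith_waterman_score`, the match/mismatch values, and the linear gap penalty are assumptions made for illustration; protein comparisons normally use a substitution matrix and affine gap penalties, and nothing here should be read as the dissertation's actual implementation.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not taken from the source.
    """
    # prev holds the previous row of the dynamic-programming matrix.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores for introducing a gap in b or in a.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Two short, made-up sequences that share the substring "ACG".
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Because every cell is clamped at zero, the score reflects only the best-matching local region, which is why local alignment suits comparing proteins that share a domain but differ elsewhere.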

Bibliographic details

  • Author

    McLeod, Michael Patrick

  • Author affiliation

    The University of Texas Health Science Center at Houston Graduate School of Biomedical Sciences

  • Degree-granting institution: The University of Texas Health Science Center at Houston Graduate School of Biomedical Sciences
  • Discipline: Biology, Molecular; Biology, Genetics; Computer Science
  • Degree: Ph.D.
  • Year: 2003
  • Pages: 120 p.
  • Total pages: 120
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Molecular genetics; Genetics; Automation and computer technology