Family specific protein sequence scoring matrices and applications.

Abstract

All alignment methods require an alignment algorithm and a scoring scheme that includes substitution scores and gap penalties. It is sometimes suggested that a proper scoring matrix is the most critical technical element in a successful protein database search. Any improvement would benefit all the processes mentioned above. This dissertation investigates a new way to construct family-specific scoring matrices, constructs scoring matrices for different protein structures, and tests their performance in sequence alignment and database search.

We extended Dayhoff's work and estimated an amino acid substitution model for each individual protein family from alignments of sequences with varying degrees of divergence, using a maximum likelihood method. Different protein families showed some unique mutation patterns. The performance test showed that the family-specific matrix performed significantly better than other matrices in detecting remote homologues. However, the family-specific matrix did not significantly improve alignment quality. Cluster analysis of enzymes in two different pathways showed that, although enzymes in the same pathway were likely to be grouped closer together, the separation was not perfect. Structural similarity may be the key reason for such separation.

We derived scoring matrices for different structures as defined in the CATH database, since different structures should have different substitution patterns. Cluster analysis of the matrices constructed at the architecture level showed that the relationships between these matrices were in general consistent with the CATH classification. Structure-based matrices significantly improved sequence alignment quality; this was especially true for matrices specific to a particular structural architecture. Structure-based matrices also performed better than the BLOSUM matrix in the fold recognition test; for the TIM barrel fold and the Rossmann fold, the fold-specific matrices were the best performers. Finally, we showed that such structure-based matrices can be useful in genome annotation.

The matrices derived in this study can be used in other database search programs, such as PSI-BLAST, to improve accuracy. The cluster analysis of mutation matrices contributes to our understanding of metabolic evolution and provides impetus for further research in pathway evolution.
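
As a rough illustration of the scoring scheme the abstract refers to (substitution scores plus gap penalties), the sketch below converts substitution probabilities into Dayhoff-style log-odds scores and then scores a given pairwise alignment with affine gap penalties. It is not the dissertation's maximum likelihood procedure. The two-letter alphabet, the probability values, the function names `log_odds_matrix` and `score_alignment`, and the gap penalty defaults are all assumptions made for the example.

```python
import math


def log_odds_matrix(cond_prob, background, scale=2.0):
    """Convert substitution probabilities into integer log-odds scores.

    cond_prob[a][b] is P(b | a); background[b] is the frequency of b.
    Each score is scale * log2(P(b | a) / background[b]), rounded.
    """
    return {
        a: {b: round(scale * math.log2(p / background[b])) for b, p in row.items()}
        for a, row in cond_prob.items()
    }


def score_alignment(seq_a, seq_b, matrix, gap_open=-10, gap_extend=-1):
    """Score an existing pairwise alignment ('-' marks a gap).

    A gap of length k costs gap_open + (k - 1) * gap_extend (assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    prev_gap = None  # which sequence held the gap in the previous column
    for x, y in zip(seq_a, seq_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        gap = "a" if x == "-" else "b" if y == "-" else None
        if gap is None:
            total += matrix[x][y]       # substitution score
        elif gap == prev_gap:
            total += gap_extend         # continuing the same gap
        else:
            total += gap_open           # opening a new gap
        prev_gap = gap
    return total


# Hypothetical toy model over a two-letter alphabet (not real amino acid data).
toy_cond_prob = {"A": {"A": 0.8, "B": 0.2}, "B": {"A": 0.2, "B": 0.8}}
toy_background = {"A": 0.5, "B": 0.5}

toy_matrix = log_odds_matrix(toy_cond_prob, toy_background)
print(toy_matrix)
print(score_alignment("AAB-A", "ABBBA", toy_matrix))
```

A family-specific or structure-specific matrix would, under this reading, differ only in the probabilities passed to `log_odds_matrix`; the alignment scoring step stays the same.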

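Bibliographic record

  • Author

    Fan, Yiping

  • Author affiliation

    University of California, San Diego

  • Degree-granting institution: University of California, San Diego
  • Subject: Engineering, Biomedical
  • Degree: Ph.D.
  • Year: 2002
  • Pages: 125 p.
  • Total pages: 125
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Biomedical engineering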