Journal of Molecular Biology

Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores.



Abstract

Measuring in a quantitative, statistical sense the degree to which structural and functional information can be "transferred" between pairs of related protein sequences at various levels of similarity is an essential prerequisite for robust genome annotation. To this end, we performed pairwise sequence, structure and function comparisons on approximately 30,000 pairs of protein domains with known structure and function. Our domain pairs, which are constructed according to the SCOP fold classification, range in similarity from just sharing a fold, to being nearly identical. Our results show that traditional scores for sequence and structure similarity have the same basic exponential relationship as observed previously, with structural divergence, measured in RMS, being exponentially related to sequence divergence, measured in percent identity. However, as the scale of our survey is much larger than any previous investigations, our results have greater statistical weight and precision. We have been able to express the relationship of sequence and structure similarity using more "modern scores," such as Smith-Waterman alignment scores and probabilistic P-values for both sequence and structure comparison. These modern scores address some of the problems with traditional scores, such as determining a conserved core and correcting for length dependency; they enable us to phrase the sequence-structure relationship in more precise and accurate terms. We found that the basic exponential sequence-structure relationship is very general: the same essential relationship is found in the different secondary-structure classes and is evident in all the scoring schemes. To relate function to sequence and structure we assigned various levels of functional similarity to the domain pairs, based on a simple functional classification scheme. 
This scheme was constructed by combining and augmenting annotations in the enzyme and fly functional classifications and comparing subsets of these to the Escherichia coli and yeast classifications. We found sigmoidal relationships between similarity in function and sequence, with clear thresholds for different levels of functional conservation. For pairs of domains that share the same fold, precise function appears to be conserved down to approximately 40% sequence identity, whereas broad functional class is conserved to approximately 25%. Interestingly, percent identity is more effective at quantifying functional conservation than the more modern scores (e.g. P-values). Results of all the pairwise comparisons and our combined functional classification scheme for protein structures can be accessed from a web database at http://bioinfo.mbb.yale.edu/align. Copyright 2000 Academic Press.
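
The two functional forms named in the abstract, an exponential sequence-structure relationship and a sigmoidal sequence-function relationship, can be illustrated with a minimal Python sketch. This is not the authors' fitting procedure: the function names, the coefficients `a` and `b`, and the `steepness` value are hypothetical placeholders, and placing the logistic midpoint at the reported thresholds (about 40% and 25% identity) is a simplifying assumption made only to show the shape of the curves.

```python
import math


def rms_from_identity(percent_identity, a=0.4, b=1.9):
    """Exponential form: structural divergence (RMS) rises exponentially
    as sequence identity falls. `a` and `b` are illustrative placeholders,
    not values fitted in the paper."""
    divergence = 1.0 - percent_identity / 100.0  # fraction of non-identical residues
    return a * math.exp(b * divergence)


def conserved_fraction(percent_identity, midpoint, steepness=0.3):
    """Sigmoidal (logistic) form: fraction of domain pairs that keep the
    same function at a given percent identity. `midpoint` and `steepness`
    are assumptions chosen for illustration."""
    return 1.0 / (1.0 + math.exp(-steepness * (percent_identity - midpoint)))


if __name__ == "__main__":
    # Compare precise function (assumed midpoint 40) with broad class (assumed midpoint 25).
    for pid in (20, 30, 40, 60, 80):
        print(
            pid,
            round(rms_from_identity(pid), 2),
            round(conserved_fraction(pid, midpoint=40), 3),
            round(conserved_fraction(pid, midpoint=25), 3),
        )
```

With these placeholder settings, the broad-class curve stays high at identities where the precise-function curve has already dropped, which mirrors the ordering of the two thresholds reported above.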
