Journal of Molecular Biology

How Well is Enzyme Function Conserved as a Function of Pairwise Sequence Identity?

Abstract

Enzyme function conservation has been used to derive the threshold of sequence identity necessary to transfer function from a protein of known function to a protein of unknown function. Using pairwise sequence comparison, several studies suggested that enzyme function is well conserved when the sequence identity is above 40%. In contrast, Rost argued that, because of database bias, the results from such simple pairwise comparisons might be misleading. By grouping enzyme sequences into families based on sequence similarity and selecting representative sequences for comparison, he showed that enzyme function starts to diverge quickly when the sequence identity is below 70%. Here, we employ a strategy similar to Rost's to reduce the database bias; however, we classify enzyme families based not only on sequence similarity but also on functional similarity, i.e. sequences in each family must share either all four digits or the first three digits of the Enzyme Commission (EC) number. Furthermore, instead of selecting representative sequences for comparison, we calculate the function conservation of each enzyme family and then average the degree of enzyme function conservation across all enzyme families. Our analysis suggests that, for functional transferability, 40% sequence identity can still be used as a confident threshold to transfer the first three digits of an EC number; however, to transfer all four digits of an EC number, more than 60% sequence identity is needed to achieve at least 90% accuracy. Moreover, when PSI-BLAST is used, the magnitude of the E-value is found to be weakly correlated with the extent of enzyme function conservation in the third iteration of PSI-BLAST. As a result, functional annotation based on the E-values from PSI-BLAST should be used with caution.
We also show that by employing an enzyme family-specific sequence identity threshold above which 100% functional conservation is required, functional inference of unknown sequences can be accurately accomplished. However, this comes at a cost: true-positive sequences below this threshold cannot be uniquely identified.
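
The family-averaged measure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `ec_match` and `family_averaged_conservation`, the input layout (one tuple per sequence pair, already assigned to a family), the use of a simple "identity at or above a threshold" cut, and the rule that partially specified EC numbers (containing "-") never match are all assumptions made for illustration. The only idea taken from the abstract is that conservation is computed within each family first and then averaged across families, so that large families do not dominate the result.

```python
from collections import defaultdict
from statistics import mean


def ec_match(ec_a: str, ec_b: str, digits: int) -> bool:
    """Return True if two EC numbers agree on their first `digits` fields."""
    a = ec_a.split(".")[:digits]
    b = ec_b.split(".")[:digits]
    # Assumption: incomplete EC numbers (missing fields or "-") never match.
    if len(a) < digits or len(b) < digits or "-" in a or "-" in b:
        return False
    return a == b


def family_averaged_conservation(pairs, threshold, digits):
    """Average EC conservation over families for pairs at or above `threshold`.

    `pairs` is an iterable of (family_id, percent_identity, ec_a, ec_b).
    Returns None when no pair reaches the threshold.
    """
    per_family = defaultdict(list)
    for family, identity, ec_a, ec_b in pairs:
        if identity >= threshold:
            per_family[family].append(ec_match(ec_a, ec_b, digits))
    if not per_family:
        return None
    # Conservation within each family first, then the mean across families.
    return mean(sum(hits) / len(hits) for hits in per_family.values())


if __name__ == "__main__":
    # Toy data, invented purely to exercise the functions.
    toy_pairs = [
        ("famA", 65.0, "1.1.1.1", "1.1.1.1"),
        ("famA", 62.0, "1.1.1.1", "1.1.1.2"),
        ("famA", 45.0, "1.1.1.1", "1.1.1.2"),
        ("famB", 70.0, "3.4.21.4", "3.4.21.4"),
    ]
    print(family_averaged_conservation(toy_pairs, threshold=60, digits=4))
    print(family_averaged_conservation(toy_pairs, threshold=40, digits=3))
```

With the toy data, the four-digit comparison at 60% identity averages one half-conserved family with one fully conserved family, whereas a pooled count over all pairs would weight the larger family more heavily.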