How significant is a protein structure similarity with TM-score = 0.5?

Abstract

Motivation: Protein structure similarity is often measured by root mean squared deviation, global distance test score and template modeling score (TM-score). However, the scores themselves cannot provide information on how significant the structural similarity is. There is also no quantitative relation between the scores and conventional fold classifications. This article aims to answer two questions: (i) What is the statistical significance of TM-score? (ii) What is the probability of two proteins having the same fold given a specific TM-score?

Results: We first made an all-to-all gapless structural match on 6684 non-homologous single-domain proteins in the PDB and found that the TM-scores follow an extreme value distribution. The data allow us to assign each TM-score a P-value that measures the chance of two randomly selected proteins obtaining an equal or higher TM-score. With a TM-score at 0.5, for instance, its P-value is 5.5 × 10⁻⁷, which means we need to consider at least 1.8 million random protein pairs to acquire a TM-score of no less than 0.5. Second, we examine the posterior probability of proteins sharing the same fold in three datasets: SCOP, CATH and the consensus of SCOP and CATH. It is found that the posterior probability from the different datasets has a similar rapid phase transition around TM-score = 0.5. This finding indicates that TM-score can be used as an approximate but quantitative criterion for protein topology classification, i.e. protein pairs with a TM-score >0.5 are mostly in the same fold while those with a TM-score <0.5 are mainly not in the same fold.

Supplementary information: Supplementary data are available at Bioinformatics online.
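The step from a P-value of 5.5 × 10⁻⁷ to "at least 1.8 million random protein pairs" is a reciprocal: if each random pair reaches the score with probability P, about 1/P pairs are needed on average to see one. The Python sketch below reproduces that arithmetic. It also shows how an upper-tail P-value could be read off an extreme value distribution, assuming the Gumbel (type I) form. The function names and the `loc` and `scale` parameters are illustrative placeholders, not the fitted values or the method from the paper.

```python
import math


def expected_pairs(p_value: float) -> float:
    """Average number of random pairs examined per pair that reaches the score."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must lie in (0, 1]")
    return 1.0 / p_value


def gumbel_tail(score: float, loc: float, scale: float) -> float:
    """Upper-tail probability P(X >= score) for a Gumbel distribution.

    Assumption: the extreme value distribution is of Gumbel (type I) form.
    loc and scale are hypothetical parameters supplied by the caller.
    """
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    z = (score - loc) / scale
    # 1 - exp(-exp(-z)), written with expm1 to stay accurate for tiny tails
    return -math.expm1(-math.exp(-z))


# P-value quoted in the abstract for TM-score = 0.5
P_AT_TM_0_5 = 5.5e-7
print(f"{expected_pairs(P_AT_TM_0_5):,.0f} random pairs")  # roughly 1.8 million
```

With real fitted parameters, `gumbel_tail(0.5, loc, scale)` would give the P-value at TM-score = 0.5 directly; the values to plug in would have to come from the paper or from a fit to one's own score distribution.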