Journal: BMC Bioinformatics

Non-Markovian effects on protein sequence evolution due to site dependent substitution rates



Abstract

Background: Many models of protein sequence evolution, in particular those based on Point Accepted Mutation (PAM) matrices, assume that the dynamics are Markovian. Nevertheless, evolution has been observed to proceed differently at different time scales, which calls this assumption into question. In 2011 Kosiol and Goldman proved that if evolution is Markovian at the codon level, it cannot be Markovian at the amino acid level. However, it remains unclear to what extent the Markov assumption holds at the codon level.

Results: Here we show that the among-site variability of substitution rates also makes the evolution of full protein sequences effectively non-Markovian, even at the codon level. This may be the theoretical explanation for the well-known systematic underestimation of evolutionary distances observed when rate variability is omitted. If substitution rate variability is neglected, the average amino acid and codon replacement probabilities are affected by systematic errors, and the largest mismatches occur for substitutions involving more than one nucleotide at a time. On the other hand, instantaneous substitution matrices estimated from alignments under the Markov assumption tend to overestimate double and triple substitutions, even when learned from alignments at high sequence identity.

Conclusions: These results discourage the use of simple Markov models to describe full protein sequence evolution. They encourage the use, whenever possible, of models that account for rate variability by construction (such as hidden Markov models or mixture models) or of substitution models of the type proposed by Le and Gascuel (2008), which account for it explicitly.
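The central claim of the abstract, that averaging over sites with different rates breaks the Markov property, can be illustrated with a toy calculation. The sketch below is not the authors' method: it assumes a four-state Jukes-Cantor model instead of a codon model, and two hypothetical site classes with rates 0.1 and 1.9 in equal proportion. Each class is Markovian on its own, but the site-averaged transition matrix fails the Chapman-Kolmogorov condition P(2t) = P(t)P(t), and a distance computed as if all sites shared one rate falls below the true mean distance.

```python
import numpy as np

def jc_matrix(rate, t):
    """Jukes-Cantor transition matrix (4 states) for one site class.

    rate * t is the expected number of substitutions per site.
    """
    e = np.exp(-4.0 * rate * t / 3.0)
    return e * np.eye(4) + (1.0 - e) / 4.0 * np.ones((4, 4))

def averaged_matrix(rates, weights, t):
    """Transition matrix averaged over site classes (a mixture of Markov chains)."""
    return sum(w * jc_matrix(r, t) for r, w in zip(rates, weights))

# Hypothetical rate classes (an assumption for illustration): mean rate is 1.
rates = np.array([0.1, 1.9])
weights = np.array([0.5, 0.5])
t = 0.5

p_t = averaged_matrix(rates, weights, t)
p_2t = averaged_matrix(rates, weights, 2 * t)

# Chapman-Kolmogorov gap: zero for a Markov process, positive for the mixture.
ck_gap = np.abs(p_2t - p_t @ p_t).max()

# Distance estimated from the averaged matrix with the single-rate
# Jukes-Cantor formula, compared with the true mean distance.
p_diff = 1.0 - p_2t[0, 0]
d_est = -0.75 * np.log(1.0 - 4.0 * p_diff / 3.0)
d_true = float(np.dot(weights, rates)) * 2 * t

# Control: a single rate class satisfies Chapman-Kolmogorov.
h_t = averaged_matrix([1.0], [1.0], t)
h_2t = averaged_matrix([1.0], [1.0], 2 * t)
ck_gap_homogeneous = np.abs(h_2t - h_t @ h_t).max()

print("Chapman-Kolmogorov gap with rate variability:", ck_gap)
print("Chapman-Kolmogorov gap with a single rate:", ck_gap_homogeneous)
print("estimated distance:", d_est, "true mean distance:", d_true)
```

In this toy model both effects follow from Jensen's inequality: the mean of exp(-4rt/3) over the rate classes exceeds the exponential of the mean, so the averaged process retains more identity than any single-rate Markov chain with the same mean rate would allow.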
