Journal: BMC Evolutionary Biology

Detecting coevolution without phylogenetic trees? Tree-ignorant metrics of coevolution perform as well as tree-aware metrics


Abstract

Background: Identifying coevolving positions in protein sequences has myriad applications, ranging from understanding and predicting the structure of single molecules to generating proteome-wide predictions of interactions. Algorithms for detecting coevolving positions fall into two categories: tree-aware, which incorporate knowledge of phylogeny, and tree-ignorant, which do not. Tree-ignorant methods are frequently orders of magnitude faster, but are widely held to be insufficiently accurate because they confound shared ancestry with coevolution. We conjectured that, by using a null distribution that appropriately controls for the shared-ancestry signal, tree-ignorant methods would exhibit statistical power equivalent to that of tree-aware methods. Using a novel t-test transformation of coevolution metrics, we systematically compared four tree-aware and five tree-ignorant coevolution algorithms, applying them to myoglobin and myosin. We further considered the influence of sequence recoding using reduced-state amino acid alphabets, a common tactic in coevolutionary analyses for improving both statistical and computational performance.

Results: Consistent with our conjecture, the transformed tree-ignorant metrics (particularly Mutual Information) often outperformed the tree-aware metrics. Our examination of the effect of recoding suggested that charge-based alphabets were generally superior for identifying the stabilizing interactions in alpha helices. Recoding did not always improve performance, however, indicating that the choice of alphabet is critical.

Conclusion: The results suggest that t-test transformation of tree-ignorant metrics can be sufficient to control for patterns arising from shared ancestry.
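To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a tree-ignorant metric (Mutual Information between two alignment columns) together with recoding to a reduced-state alphabet. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the charge-based grouping (K/R positive, D/E negative, everything else neutral) is one plausible choice rather than the alphabet used in the study. The t-test transformation is not specified in the abstract and is therefore not reproduced here.

```python
from collections import Counter
from math import log2

# ASSUMPTION: an illustrative charge-based reduced alphabet, not taken from the paper.
# K, R -> "+", D, E -> "-", every other residue -> "0" (neutral).
CHARGE_ALPHABET = {"K": "+", "R": "+", "D": "-", "E": "-"}


def recode(column, mapping=CHARGE_ALPHABET, default="0"):
    """Recode one alignment column into a reduced-state alphabet."""
    return [mapping.get(residue, default) for residue in column]


def mutual_information(col_a, col_b):
    """Mutual Information (in bits) between two alignment columns.

    Each column is a sequence of states, one per aligned sequence.
    No phylogenetic tree is used, so this is a tree-ignorant metric.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    if n == 0:
        raise ValueError("columns must not be empty")

    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))

    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Toy columns (hypothetical data, four sequences each).
    col_1 = "KRAG"
    col_2 = "DEAG"
    print(mutual_information(col_1, col_2))
    print(mutual_information(recode(col_1), recode(col_2)))
```

Recoding collapses residues with the same charge into one state, which shrinks the joint state space that the metric has to estimate from a finite alignment. As the Results note, whether this helps depends on the alphabet chosen.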
