Journal: Algorithms for Molecular Biology

Fast and accurate structure probability estimation for simultaneous alignment and folding of RNAs with Markov chains


Abstract

Simultaneous alignment and folding (SA&F) of RNAs is the indispensable gold standard for inferring the structure of non-coding RNAs and for their general analysis. The original algorithm, proposed by Sankoff, solves the theoretical problem exactly with a complexity of $O(n^6)$ in the full energy model. Over the last two decades, several variants and improvements of the Sankoff algorithm have been proposed to reduce its extreme complexity, either by simplifying the energy model or by imposing restrictions on the predicted alignments. Here, we introduce a novel variant of Sankoff’s algorithm that reconciles the simplifications of PMcomp, namely moving from the full energy model to a simpler base pair-based model, with the accuracy of the loop-based full energy model. Instead of estimating pseudo-energies from unconditional base pair probabilities, our model calculates energies from conditional base pair probabilities, which allows it to accurately capture structure probabilities that obey a conditional dependency. This model gives rise to the fast and highly accurate novel algorithm Pankov (Probabilistic Sankoff-like simultaneous alignment and folding of RNAs inspired by Markov chains). Pankov benefits from the speed-up of excluding unreliable base pairings without compromising the loop-based free energy model of Sankoff’s algorithm. We show that Pankov outperforms its predecessors LocARNA and SPARSE in folding quality and is faster than LocARNA.
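
The difference between scoring a structure from unconditional and from conditional base pair probabilities can be illustrated with a toy calculation. The Python sketch below is not Pankov's scoring scheme, which the abstract does not spell out. It only assumes a nested structure in which each base pair depends on the innermost pair enclosing it, as in a Markov chain. The function names and all probability values are hypothetical and chosen purely for illustration.

```python
import math

def parent_of(pair, pairs):
    """Return the innermost pair strictly enclosing `pair`, or None."""
    i, j = pair
    enclosing = [(k, l) for (k, l) in pairs if k < i and j < l]
    # In a nested (non-crossing) structure, the innermost enclosing pair
    # is the one with the largest left end.
    return max(enclosing, key=lambda p: p[0]) if enclosing else None

def log_prob_independent(pairs, p_uncond):
    """Log-probability estimate treating every base pair as independent."""
    return sum(math.log(p_uncond[p]) for p in pairs)

def log_prob_chain(pairs, p_uncond, p_cond):
    """Chain-rule estimate: each pair is conditioned on its enclosing pair."""
    total = 0.0
    for p in pairs:
        parent = parent_of(p, pairs)
        if parent is None:
            total += math.log(p_uncond[p])          # outermost pair
        else:
            total += math.log(p_cond[(p, parent)])  # P(pair | enclosing pair)
    return total

# Hypothetical toy helix of three stacked base pairs (made-up numbers).
pairs = [(1, 20), (2, 19), (3, 18)]
p_uncond = {(1, 20): 0.5, (2, 19): 0.5, (3, 18): 0.5}
p_cond = {((2, 19), (1, 20)): 0.9, ((3, 18), (2, 19)): 0.9}

independent = math.exp(log_prob_independent(pairs, p_uncond))
chained = math.exp(log_prob_chain(pairs, p_uncond, p_cond))
print(independent, chained)
```

With these made-up values, the independent estimate multiplies three moderate probabilities, while the chained estimate reflects that stacked pairs in a helix tend to form together once the enclosing pair is present.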
