Source: Bioinformatics (journal)

CentroidAlign: fast and accurate aligner for structured RNAs by maximizing expected sum-of-pairs score

Abstract

Motivation: The importance of accurate and fast predictions of multiple alignments for RNA sequences has increased due to recent findings about functional non-coding RNAs. Recent studies suggest that maximizing the expected accuracy of predictions will be useful for many problems in bioinformatics.

Results: We designed a novel estimator for multiple alignments of structured RNAs, based on maximizing the expected accuracy of predictions. First, we define the maximum expected accuracy (MEA) estimator for pairwise alignment of RNA sequences. This maximizes the expected sum-of-pairs score (SPS) of a predicted alignment under a probability distribution of alignments given by marginalizing the Sankoff model. Then, by approximating the MEA estimator, we obtain an estimator whose time complexity is O(L^3 + c^2 d L^2), where L is the length of the input sequences and c and d are constants independent of L. The proposed estimator can handle the uncertainty of secondary structures and alignments, which is an obstacle in bioinformatics, because it considers all the secondary structures and all the pairwise alignments of the input sequences. Moreover, we integrate the probabilistic consistency transformation (PCT) on alignments into the proposed estimator. Computational experiments using six benchmark datasets indicate that the proposed method achieved a favorable SPS and was the fastest of many state-of-the-art tools for multiple alignments of structured RNAs.
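The general idea behind an MEA estimator for pairwise alignment can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a matrix of match posterior probabilities is already available (in the paper these come from marginalizing the Sankoff model, which is not reproduced here), and it uses a plain dynamic program that maximizes the sum of those probabilities over aligned positions. The function name `mea_align` and the toy matrix are hypothetical.

```python
from typing import List, Tuple


def mea_align(post: List[List[float]]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (expected score, aligned index pairs) for a posterior matrix.

    post[i][j] is an ASSUMED posterior probability that position i of the
    first sequence is aligned to position j of the second sequence.
    """
    n = len(post)
    m = len(post[0]) if n else 0

    # score[i][j]: best sum of posteriors using the first i and j positions.
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + post[i - 1][j - 1],  # align i with j
                score[i - 1][j],                           # gap in sequence 2
                score[i][j - 1],                           # gap in sequence 1
            )

    # Traceback: prefer gap moves on ties so zero-probability pairs are skipped.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j]:
            i -= 1
        elif score[i][j] == score[i][j - 1]:
            j -= 1
        else:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    # Toy posterior matrix for sequences of length 2 and 3 (made-up values).
    toy = [[0.9, 0.05, 0.0],
           [0.1, 0.2, 0.7]]
    print(mea_align(toy))
```

On the toy matrix, the traceback aligns position 0 with position 0 and position 1 with position 2, leaving the middle position of the second sequence unaligned. The dynamic program runs in time proportional to the product of the two sequence lengths; the cost of computing the posterior matrix itself is a separate matter and is where the structural (Sankoff-style) information would enter.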
