【24h】

A Polynomial Time Solvable Formulation of Multiple Sequence Alignment

机译:多序列比对的多项式时间可解公式

获取原文
获取原文并翻译 | 示例

摘要

Since traditional multiple alignment formulations are NP-hard, heuristics are commonly employed to find acceptable alignments with no guaranteed performance bound. This causes a substantial difficulty in understanding what the resulting alignment means and in assessing the quality of these alignments. We propose an alternative formulation of multiple alignment based on the idea of finding a multiple alignment of k sequences which preserves k -1 pairwise alignments as specified by edges of a given tree. Although it is well known that such a preserving alignment always exists, it did not become a mainstream method for multiple alignment since it seems that a lot of information is lost from ignoring pairwise similarities outside the tree. In contrast, by using pairwise alignments that incorporate consistency information from other sequences, we show that it is possible to obtain very good accuracy with the preserving alignment formulation. We show that a reasonable objective function to use is to find the shortest preserving alignment, and, by a reduction to a graph-theoretic problem, that the problem of finding the shortest preserving multiple alignment can be solved in polynomial time. We demonstrate the success of this approach on three sets of benchmark multiple alignments by using consistency-based pairwise alignments from the first stage of two of the best performing progressive alignment algorithms TCoffee and PROBCONS, and replace the second heuristic progressive step of these algorithms by the exact preserving alignment step (we ignore the iterative refinement step in this study). We apply this strategy to TCoffee and show that the new approach outperforms TCoffee on two of the three test sets. We apply the strategy to a variant of PROBCONS with no iterative refinements and show that the new approach achieves a similar accuracy except on one test set. The most important advantage of the preserving alignment formulation is that we are certain that we can solve the problem in polynomial time without using a heuristic.
机译:由于传统的多重比对公式是NP-hard的,因此通常采用启发式方法来找到可接受的比对,而没有任何性能保证。这在理解所得到的比对意味着什么以及评估这些比对的质量方面造成很大的困难。我们基于发现k个序列的多重比对的想法,提出了一种多重比对的替代方法,该序列可以保留k -1成对比对,如给定树的边缘所指定。尽管众所周知,这样的保留对齐一直存在,但是由于忽略树外的成对相似性似乎丢失了很多信息,因此它并没有成为主流的多重对齐方法。相比之下,通过使用包含来自其他序列的一致性信息的成对比对,我们表明可以通过保留比对公式获得非常好的准确性。我们表明,使用的合理目标函数是找到最短的保留对齐方式,并且通过简化图论问题,可以在多项式时间内解决找到最短的保留多重对齐方式的问题。我们通过使用性能最佳的两个渐进比对算法TCoffee和PROBCONS中的两个的第一阶段的基于一致性的成对比对,证明了该方法在三组基准多重比对中的成功,并用替换了这些算法的第二个启发式渐进步骤。精确保留对齐步骤(在本研究中我们忽略了迭代优化步骤)。我们将此策略应用于TCoffee,并显示出新方法在三个测试集中的两个中均优于TCoffee。我们将该策略应用于没有迭代改进的PROBCONS变体,并表明,除了在一个测试集上,新方法达到了相似的准确性。保留对齐方式公式的最重要优点是,我们确定可以在不使用启发式的情况下解决多项式问题。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号