A Polynomial Time Solvable Formulation of Multiple Sequence Alignment

机译：多序列比对的多项式时间可解公式

获取原文

获取原文并翻译 | 示例

页面导航

摘要
著录项
相似文献
相关主题

摘要

Since traditional multiple alignment formulations are NP-hard, heuristics are commonly employed to find acceptable alignments with no guaranteed performance bound. This causes a substantial difficulty in understanding what the resulting alignment means and in assessing the quality of these alignments. We propose an alternative formulation of multiple alignment based on the idea of finding a multiple alignment of k sequences which preserves k -1 pairwise alignments as specified by edges of a given tree. Although it is well known that such a preserving alignment always exists, it did not become a mainstream method for multiple alignment since it seems that a lot of information is lost from ignoring pairwise similarities outside the tree. In contrast, by using pairwise alignments that incorporate consistency information from other sequences, we show that it is possible to obtain very good accuracy with the preserving alignment formulation. We show that a reasonable objective function to use is to find the shortest preserving alignment, and, by a reduction to a graph-theoretic problem, that the problem of finding the shortest preserving multiple alignment can be solved in polynomial time. We demonstrate the success of this approach on three sets of benchmark multiple alignments by using consistency-based pairwise alignments from the first stage of two of the best performing progressive alignment algorithms TCoffee and PROBCONS, and replace the second heuristic progressive step of these algorithms by the exact preserving alignment step (we ignore the iterative refinement step in this study). We apply this strategy to TCoffee and show that the new approach outperforms TCoffee on two of the three test sets. We apply the strategy to a variant of PROBCONS with no iterative refinements and show that the new approach achieves a similar accuracy except on one test set. The most important advantage of the preserving alignment formulation is that we are certain that we can solve the problem in polynomial time without using a heuristic.

机译：由于传统的多重比对公式是NP-hard的，因此通常采用启发式方法来找到可接受的比对，而没有任何性能保证。这在理解所得到的比对意味着什么以及评估这些比对的质量方面造成很大的困难。我们基于发现k个序列的多重比对的想法，提出了一种多重比对的替代方法，该序列可以保留k -1成对比对，如给定树的边缘所指定。尽管众所周知，这样的保留对齐一直存在，但是由于忽略树外的成对相似性似乎丢失了很多信息，因此它并没有成为主流的多重对齐方法。相比之下，通过使用包含来自其他序列的一致性信息的成对比对，我们表明可以通过保留比对公式获得非常好的准确性。我们表明，使用的合理目标函数是找到最短的保留对齐方式，并且通过简化图论问题，可以在多项式时间内解决找到最短的保留多重对齐方式的问题。我们通过使用性能最佳的两个渐进比对算法TCoffee和PROBCONS中的两个的第一阶段的基于一致性的成对比对，证明了该方法在三组基准多重比对中的成功，并用替换了这些算法的第二个启发式渐进步骤。精确保留对齐步骤（在本研究中我们忽略了迭代优化步骤）。我们将此策略应用于TCoffee，并显示出新方法在三个测试集中的两个中均优于TCoffee。我们将该策略应用于没有迭代改进的PROBCONS变体，并表明，除了在一个测试集上，新方法达到了相似的准确性。保留对齐方式公式的最重要优点是，我们确定可以在不使用启发式的情况下解决多项式问题。

著录项

来源
《Annual International Conference on Research in Computational Molecular Biology(RECOMB 2005); 20050514-18; Cambridge,MA(US)》|2005年|P.204-216|共13页
会议地点 CambridgeMA(US)
作者
Sing-Hoi Sze; Yue Lu; Qingwu Yang;
展开▼
作者单位

Department of Computer Science, Texas AM University, College Station, TX 77843, USA;

展开▼
会议组织
原文格式 PDF
正文语种 eng
中图分类生物工程学（生物技术）;
关键词
入库时间 2022-08-26 14:16:34

相似文献

外文文献
中文文献
专利

1. A polynomial time solvable formulation of multiple sequence alignment [J] . Sze SH, Lu Y, Yang QW Journal of computational biology: A journal of computational molecular cell biology . 2006,第2期

机译：多序列比对的多项式时间可解公式
2. Comparing multi-objective metaheuristics for solving a three-objective formulation of multiple sequence alignment [J] . Cristian Zambrano-Vega, Antonio J. Nebro, José García-Nieto, Progress in Artificial Intelligence . 2017,第3期

机译：比较多目标地质学，以解决多序列对齐三目标配方
3. Multiple Sequence Alignment Based on a Suffix Tree and Center-Star Strategy: A Linear Method for Multiple Nucleotide Sequence Alignment on Spark Parallel Framework [J] . Su Wenhe, Liao Xiangke, Lu Yutong, Journal of computational biology: A journal of computational molecular cell biology . 2017,第12期

机译：基于后缀树和中心星策略的多个序列对齐：一种用于火花并联框架上多个核苷酸序列对准的线性方法
4. A Polynomial Time Solvable Formulation of Multiple Sequence Alignment [C] . Sing-Hoi Sze, Yue Lu, Qingwu Yang Annual International Conference on Research in Computational Molecular Biology . 2005

机译：多序列对准的多项式时间可溶性配方
5. ShareMSA: Real Time Collaboration on Multiple Sequence Alignments. [D] . Yunes, Jeffrey. 2017

机译：ShareMSA：多个序列比对的实时协作。
6. RBT-GA: a novel metaheuristic for solving the multiple sequence alignment problem [O] . Javid Taheri, Albert Y Zomaya 2009

机译：RBT-GA：解决多序列比对问题的新型元启发式方法
7. A novel numerical implementation for solving time fractional telegraph differential equations having multiple space and time delays via Delannoy polynomial [O] . Ömür Kıvanç KÜRKÇÜ 2020

机译：一种新的数值实现，用于通过Delannoy多项式解决具有多个空间和时间延迟的时间分数电报微分方程

A Polynomial Time Solvable Formulation of Multiple Sequence Alignment

摘要

著录项

相似文献

相关主题

期刊订阅