Journal of Combinatorial Optimization

An Approximation Algorithm for Alignment of Multiple Sequences using Motif Discovery



Abstract

Given a set of N sequences, the Multiple Sequence Alignment problem is to align these N sequences, possibly with gaps, in a way that brings out their best commonality. The quality of an alignment is usually measured by penalizing mismatches and gaps and rewarding matches, using appropriate weight functions. However, for larger values of N, additional constraints are required to give meaningful alignments. We identify a user-controlled parameter, an alignment number K (2 <= K <= N): this additional requirement constrains the alignment so that, whenever possible, at least K sequences agree on a character. We identify a natural optimization problem for this approach, called the K-MAS problem, and show that it is MAX SNP-hard. We give a natural extension of this problem that incorporates "biological relevance" by using motifs (common patterns in the sequences), and give an approximation algorithm for this problem in terms of the motifs in the data. MUSCA is an implementation of this approach. Our experimental results indicate that the approach is efficient, particularly on large numbers of long sequences, and gives good alignments when tested on biological data such as DNA and protein sequences.
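
The scoring idea in the abstract (reward matches, penalize mismatches and gaps) and the alignment number K can be illustrated with a small sketch. This is not MUSCA or the K-MAS algorithm from the paper. The sum-of-pairs scheme, the weights (match=1, mismatch=-1, gap=-2), the function names, and the reading of "K sequences agree on a character" as "at least K rows share the same non-gap character in a column" are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

GAP = "-"  # assumed gap symbol


def sum_of_pairs_score(rows, match=1, mismatch=-1, gap=-2):
    """Score an alignment column by column over all pairs of rows.

    The weights are illustrative assumptions, not values from the paper.
    """
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    total = 0
    for column in zip(*rows):
        for a, b in combinations(column, 2):
            if a == GAP and b == GAP:
                continue  # gap-gap pairs contribute nothing in this sketch
            if a == GAP or b == GAP:
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


def columns_with_k_agreement(rows, k):
    """Return indices of columns where at least k rows share one non-gap character.

    This is an assumed reading of the abstract's alignment number K.
    """
    agreeing = []
    for index, column in enumerate(zip(*rows)):
        counts = Counter(c for c in column if c != GAP)
        if counts and max(counts.values()) >= k:
            agreeing.append(index)
    return agreeing


# A hypothetical alignment of N = 3 short DNA sequences.
alignment = ["ACG-T", "A-GCT", "ACGCT"]
score = sum_of_pairs_score(alignment)
agree = columns_with_k_agreement(alignment, k=3)
```

With K = 3 only the fully conserved columns qualify, while K = 2 also accepts columns in which one row carries a gap, which shows how a smaller K loosens the constraint.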
