首页> 外文会议>Annual Symposium on Combinatorial Pattern Matching(CPM 2006); 20060705-07; Barcelona(ES) >Efficient Algorithms for Regular Expression Constrained Sequence Alignment
【24h】

Efficient Algorithms for Regular Expression Constrained Sequence Alignment

机译:正则表达式约束序列比对的高效算法

获取原文
获取原文并翻译 | 示例

摘要

Imposing constraints is an effective means to incorporate biological knowledge into alignment procedures. As in the PROSITE database, functional sites of proteins can be effectively described as regular expressions. In an alignment of protein sequences it is natural to expect that functional motifs should be aligned together. Due to this motivation, in CPM 2005 Arslan introduced the regular expression constrained sequence alignment problem and proposed an algorithm which can take time and space up to O(|∑|~2 |V|~4n~2) and O(|∑|~2 |V|~4n), respectively, where ∑ is the alphabet, n is the sequence length, and V is the set of states in an automaton equivalent to the input regular expression. In this paper we propose a more efficient algorithm solving this problem which takes O(|V|~3 n~2) time and O(|V|~2 n) space in the worst case. If |V| = O(log n) we propose another algorithm with time complexity O(|V|~2 log |V| n~2). The time complexity of our algorithms is independent of ∑, which is desirable in protein applications where the formulation of this problem originates; a factor of |∑|~2 = 400 in the time complexity of the previously proposed algorithm would significantly affect the efficiency in practice.
机译:施加约束是将生物学知识整合到比对程序中的有效手段。像在PROSITE数据库中一样,蛋白质的功能位点可以有效地描述为正则表达式。在蛋白质序列的比对中,自然希望功能基序应该一起比对。由于这种动机,Arslan在CPM 2005中引入了正则表达式约束序列比对问题,并提出了一种算法,该算法可以占用最多O(| ∑ |〜2 | V |〜4n〜2)和O(| ∑ | 〜2 | V |〜4n),其中∑是字母,n是序列长度,V是自动机中与输入正则表达式等效的状态集。在本文中,我们提出了一种更有效的算法来解决此问题,该算法在最坏的情况下占用O(| V |〜3 n〜2)时间和O(| V |〜2 n)空间。如果| V | = O(log n)我们提出了另一种算法,其时间复杂度为O(| V |〜2 log | V | n〜2)。我们算法的时间复杂度与∑无关,这在产生此问题的蛋白质应用中是合乎需要的。在先前提出的算法的时间复杂度中,| ∑ |〜2 = 400的因子将在实践中显着影响效率。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号