首页> 外文会议>International Conference on String Processing and Information Retrieval >New Algorithms for Finding Monad Patterns in DNA Sequences
【24h】

New Algorithms for Finding Monad Patterns in DNA Sequences

机译:用于在DNA序列中查找Monad模式的新算法

获取原文

摘要

In this paper, we present two new algorithms for discovering monad patterns in DNA sequences. Monad patterns are of the form (l,d)-k, where l is the length of the pattern, d is the maximum number of mismatches allowed, and k is the minimum number of times the pattern is repeated in the given sample. The time-complexity of some of the best known algorithms to date is O(nt~2 l~d σ~d), where t is the number of input sequences, n is the length of each input sequence, and σ = |Σ| is the size of the alphabet. The first algorithm that we present in this paper takes O(n~2 t~2 l~(d/2)) time and O(ntl~(d/2) σ~(d/2)) space, and the second algorithm takes O(n~3 t~3 l~(d/2) σ~(d/2)) time using O(l~(d/2) σ~(d/2)) space. In practice, our algorithms have much better performance provided the d/l ratio is small. The second algorithm performs very well even for large values l and d as long as the d/l ratio is small.
机译:在本文中,我们提出了两个用于在DNA序列中发现Monad模式的新算法。 Monad图案是形式(L,D)-K,其中L是图案的长度,D是允许的最大不匹配数,K是在给定样品中重复模式的最小次数。迄今为止一些最知名的算法的时间复杂性是O(nt〜2 l〜dσ〜d),其中t是输入序列的数量,n是每个输入序列的长度,σ= |σ |是字母表的大小。我们在本文中存在的第一算法需要O(n〜2 t〜2 l〜(d / 2))时间和o(ntl〜(d / 2)σ〜(d / 2))空间,第二个算法采用O(n〜3 t〜3 l〜(d / 2)σ〜(d / 2))时间使用o(l〜(d / 2)σ〜(d / 2))空间。在实践中,我们的算法提供了更好的性能,提供了D / L比例很小。即使对于大值L和D,第二算法也可以非常好,只要D / L比例很小即可。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号