Conference: 21st International Conference on Genome Informatics

RecMotif: a novel fast algorithm for weak motif discovery

Abstract

Background: Weak motif discovery in DNA sequences is an important but unresolved problem in computational biology. Previous algorithms for this problem usually require a large amount of memory or execution time. In this paper, we propose a fast and memory-efficient algorithm, RecMotif, which is guaranteed to discover all motifs for a given (l, d) setting, where l is the motif length and d is the maximum number of mutations between a motif instance and the true motif.

Results: Comparisons with several recently proposed algorithms show that RecMotif scales better to longer and weaker motifs. For instance, it solves open challenge cases such as (40, 14) within 5 hours, whereas the other algorithms compared failed because of either longer execution times or a shortage of memory. For real biological sequences, such as E. coli CRP, RecMotif accurately discovers the motif instances with (l, d) = (18, 6) in less than 1 second, which is faster than the other algorithms compared.

Conclusions: RecMotif is a novel algorithm with a space complexity of only O(m²n), where m is the number of sequences in the data and n is the length of the sequences.
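To make the (l, d) setting concrete, the following is a minimal sketch of the problem definition only, not of RecMotif itself, whose search procedure the abstract does not describe. It assumes that "mutations" means substitutions counted by Hamming distance and that a motif must have at least one instance in every sequence; the function names `hamming`, `find_instances`, and `is_ld_motif` are hypothetical.

```python
def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_instances(motif, sequences, d):
    """For each sequence, list the start offsets of every l-mer that lies
    within d substitutions of `motif`, where l = len(motif)."""
    l = len(motif)
    return [
        [i for i in range(len(seq) - l + 1) if hamming(motif, seq[i:i + l]) <= d]
        for seq in sequences
    ]


def is_ld_motif(motif, sequences, d):
    """Assumed definition: `motif` is an (l, d) motif if every sequence
    contains at least one instance (an empty offset list means none)."""
    return all(find_instances(motif, sequences, d))


if __name__ == "__main__":
    seqs = ["TTACGTTT", "GGACCTGG"]
    print(find_instances("ACGT", seqs, 1))  # one offset list per sequence
    print(is_ld_motif("ACGT", seqs, 1))
```

Checking one candidate this way costs time proportional to m · n · l. The hard part of the problem is that the true motif is unknown, so a discovery algorithm must search the candidate space without enumerating all 4^l strings.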