
New Bounds for Motif Finding in Strong Instances


Abstract

Many algorithms for motif finding that are commonly used in bioinformatics start by sampling r potential motif occurrences from n input sequences. The motif is derived from these samples and evaluated on all sequences. This approach works extremely well in practice, and is implemented by several programs. Li, Ma and Wang have shown that a simple algorithm of this sort is a polynomial-time approximation scheme. However, in 2005, we showed specific instances of the motif finding problem for which the approximation ratio of a slight variation of this scheme converges to one very slowly as a function of the sample size r, which seemingly contradicts the high performance of sample-based algorithms. Here, we account for the difference by showing that, for a variety of different definitions of "strong" binary motifs, the approximation ratio of sample-based algorithms converges to one exponentially fast in r. We also describe "very strong" motifs, for which the simple sample-based approach always identifies the correct motif, even for modest values of r.
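The sample-then-evaluate pattern described in the abstract can be illustrated with a minimal sketch. It assumes binary strings, a column-wise majority consensus, and a total Hamming distance objective; none of these details are spelled out in the abstract. The function names (`windows`, `consensus`, `score`, `best_motif`) are hypothetical. Instead of drawing samples at random, the sketch enumerates every multiset of r windows, which is only feasible for tiny inputs. It is not the authors' implementation.

```python
from itertools import combinations_with_replacement


def windows(seq, length):
    """All contiguous substrings of the given length."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def consensus(samples):
    """Column-wise majority vote over equal-length binary strings (ties go to '0')."""
    r = len(samples)
    return "".join(
        "1" if 2 * sum(s[j] == "1" for s in samples) > r else "0"
        for j in range(len(samples[0]))
    )


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def score(motif, sequences):
    """Sum, over all sequences, of the Hamming distance to the closest window."""
    return sum(
        min(hamming(motif, w) for w in windows(s, len(motif)))
        for s in sequences
    )


def best_motif(sequences, length, r):
    """Derive a consensus from every multiset of r windows; keep the lowest-cost one."""
    # Assumption: candidate occurrences are all windows of all input sequences.
    pool = sorted({w for s in sequences for w in windows(s, length)})
    best = None
    for samples in combinations_with_replacement(pool, r):
        candidate = consensus(samples)
        cost = score(candidate, sequences)
        if best is None or cost < best[1]:
            best = (candidate, cost)
    return best


sequences = ["0011010", "1101011", "0001101", "1011000"]
print(best_motif(sequences, length=4, r=3))
```

In this toy instance, "1101" occurs exactly in three of the four sequences and with one mismatch in the fourth. A randomized variant would replace the exhaustive loop with a fixed number of random draws of r windows from `pool`.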
