首页> 外文会议>Workshop on Algorithm Engineering and Experiments >A Closer Look at the Closest String and Closest Substring Problem
【24h】

A Closer Look at the Closest String and Closest Substring Problem

机译:仔细看看最接近的字符串和最近的子字符问题

获取原文

摘要

Let S be a set of k strings over an alphabet ∑; Each string has a length between l and n. The Closest Substring Problem (CSSP) is to find a minimal integer d (and a corresponding string t of length l) such that each string s ∈ S has a substring of length l with Hamming distance at most d to t. We say t is the closest substring to S. For l = n, this problem is known as the Closest String Problem (CSP). Particularly in computational biology, the CSP and CSSP have found numerous practical applications such as identifying regulatory motifs and approximate gene clusters, and in degenerate primer design. We study ILP formulations for both problems. Our experiments show that a position-based formulation for the CSP performs very well on real-world instances emerging from biology. Even on randomly generated instances that are hard to solve to optimality, solving the root relaxation leads to solutions very close to the optimum. For the CSSP we give a new formulation that is polytope-wise stronger than a straightforward extension of the CSP formulation. Furthermore we propose a strengthening constraint class that speeds up the running time.
机译:让S通过字母σ是一组K字符串;每个字符串的长度在l和n之间。最近的子字符串问题(CSSP)是找到最小整数D(和长度L的相应串T),使得每个弦S∈S具有长度L的子字符串,最多d至t的汉明距离。我们说T是L = N的最接近的子字符串,这个问题被称为最接近的字符串问题(CSP)。特别是在计算生物学中,CSP和CSSP发现了许多实际应用,例如鉴定调节基序和近似基因簇,以及脱底底漆设计。我们研究了两个问题的ILP配方。我们的实验表明,CSP的基于位置的配方在从生物学中出现的真实情况下表现得非常好。即使在随机生成的情况下难以解决最优,求解根部弛豫导致溶液非常接近最佳。对于CSSP,我们给出了一种新的配方,该配方是比CSP配方的直接延伸更强的多孔。此外,我们提出了加强速度的加强约束等级。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号