【24h】

Consequences of Faster Alignment of Sequences

机译:序列比对更快的结果

获取原文

摘要

The Local Alignment problem is a classical problem with applications in biology. Given two input strings and a scoring function on pairs of letters, one is asked to find the substrings of the two input strings that are most similar under the scoring function. The best algorithms for Local Alignment run in time that is roughly quadratic in the string length. It is a big open problem whether substantially subquadratic algorithms exist. In this paper we show that for all ε > 0, an O(n~(2-ε)) time algorithm for Local Alignment on strings of length n would imply breakthroughs on three longstanding open problems: it would imply that for some δ > 0, 3SUM on n numbers is in O(n~(2-δ)) time, CNF-SAT on n variables is in O((2 - δ)~n) time, and Max Weight 4-Clique is in O(n~(4~δ)) time. Our result for CNF-SAT also applies to the easier problem of finding the longest common substring of binary strings with don't cares. We also give strong conditional lower bounds for the more general Multiple Local Alignment problem on k strings, under both k-wise and SP scoring, and for other string similarity problems such as Global Alignment with gap penalties and normalized Longest Common Subsequence.
机译:局部比对问题是生物学中的经典问题。给定两个输入字符串和一个成对字母的计分功能,要求一个查找在计分功能下最相似的两个输入字符串的子字符串。最佳的本地对齐算法运行时间大约是字符串长度的二次方。是否存在实质上次二次算法是一个大的开放问题。在本文中,我们表明,对于所有ε> 0,长度为n的字符串的局部对齐的O(n〜(2-ε))时间算法将暗示对三个长期存在的开放问题的突破:这意味着对于某些δ> 0,n个数字的3SUM以O(n〜(2-δ))时间为单位,n个变量的CNF-SAT以O((2-δ)〜n)时间为单位,最大权重4-Clique以O(n n〜(4〜δ))时间。我们针对CNF-SAT的结果也适用于以下问题:查找二进制字符串的最长公共子字符串,而无需关心。我们还针对k字符串和k得分上更普遍的多重局部对齐问题以及其他字符串相似性问题(例如带空位罚分的全局对齐和规范化的最长公共子序列)给出了较强的条件下界,以解决k字符串上更普遍的多重局部对齐问题。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号