首页> 外文会议>International Symposium on String Processing and Information Retrieval >Weighted Shortest Common Supersequence Problem Revisited
【24h】

Weighted Shortest Common Supersequence Problem Revisited

机译:再论加权最短公共超序列问题

获取原文
获取外文期刊封面目录资料

摘要

A weighted string, also known as a position weight matrix, is a sequence of probability distributions over some alphabet. We revisit the Weighted Shortest Common Supersequence (WSCS) problem, introduced by Amir et al. [SPIRE 2011], that is, the SCS problem on weighted strings. In the WSCS problem, we are given two weighted strings W_1 and W_2 and a threshold - on probability, and we are asked to compute the shortest (standard) string S such that both W_1 and W_2 match subsequences of S (not necessarily the same) with probability at least —. Amir et al. showed that this problem is NP-complete if the probabilities, including the threshold 1/2, are represented by their logarithms (encoded in binary). We present an algorithm that solves the WSCS problem for two weighted strings of length n over a constant-sized alphabet in O(n~2/z log z) time. Notably, our upper bound matches known conditional lower bounds stating that the WSCS problem cannot be solved in O(n~(2~ε)) time or in O*(z~(0.5-ε)) with time, where the O* notation suppresses factors polynomial with respect to the instance size (with numeric values encoded in binary), unless there is a breakthrough improving upon longstanding upper bounds for fundamental NP-hard problems (CNF-SAT and Subset Sum, respectively). We also discover a fundamental difference between the WSCS problem and the Weighted Longest Common Subsequence (WLCS) problem, introduced by Amir et al. [JDA 2010]. We show that the WLCS problem cannot be solved in O(n~(f(z))) time, for any function f(z), unless P = NP.
机译:加权字符串(也称为位置权重矩阵)是某个字母上的概率分布序列。我们将重新讨论由Amir等人提出的加权最短公共超序列(WSCS)问题。 [SPIRE 2011],即加权字符串的SCS问题。在WSCS问题中,我们给了两个加权字符串W_1和W_2以及一个概率阈值,并且要求我们计算最短(标准)字符串S,以使W_1和W_2都匹配S的子序列(不一定相同)至少有概率-。阿米尔(Amir)等。表示如果以概率的对数表示(包括阈值1/2)(用二进制编码),则此问题是NP完全的。我们提出了一种算法,可以在O(n〜2 / z log z)时间内,在恒定大小的字母上解决长度为n的两个加权字符串的WSCS问题。值得注意的是,我们的上限匹配了已知的条件下限,指出WSCS问题无法在O(n〜(2〜ε))时间或O *(z〜(0.5-ε))的时间内解决,其中O *记号会抑制因实例大小(使用二进制编码的数值)而多项式化的因子,除非对基本的NP硬问题(分别为CNF-SAT和Subset Sum)的长期上限有了突破性改进。我们还发现了由Amir等人介绍的WSCS问题和加权最长公共子序列(WLCS)问题之间的根本区别。 [JDA 2010]。我们证明,对于任何函数f(z),除非P = NP,否则WLCS问题无法在O(n〜(f(z)))时间内解决。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号