Anytime algorithms for the longest common palindromic subsequence problem


获取原文并翻译 | 示例


The longest common palindromic subsequence (LCPS) problem aims at finding a longest string that appears as a subsequence in each of a set of input strings and is a palindrome at the same time. The problem is a special variant of the well known longest common subsequence problem and has applications in particular in genomics and biology, where strings correspond to DNA or protein sequences and similarities among them shall be detected or quantified. We first present a more traditional A* search that makes use of an advanced upper bound calculation for partial solutions. This exact approach works well for instances with two input strings and, as shown in experiments, outperforms several other exact methods from the literature. However, the A* search also has natural limitations when a larger number of strings shall be considered due to the problem's complexity. To effectively deal with this case in practice, anytime A* search variants are investigated, which are able to return a reasonable heuristic solution at almost any time and are expected to find better and better solutions until reaching a proven optimum when enough time given. In particular a novel approach is proposed in which Anytime Column Search (ACS) is interleaved with traditional A* node expansions. The ACS iterations are guided by a new heuristic function that approximates the expected length of an LCPS in subproblems usually much better than the available upper bound calculation. This A*+ACS hybrid is able to solve small to medium-sized LCPS instances to proven optimality while returning good heuristic solutions together with upper bounds for large instances. In rigorous experimental evaluations we compare A*+ACS to several other anytime A* search variants and observe its superiority. (C) 2019 Elsevier Ltd. All rights reserved.
机译:最长的通用回文子序列(LCPS)问题旨在找到最长的字符串,该字符串在每个输入字符串集中都作为子序列出现,并且同时是回文。该问题是众所周知的最长公共子序列问题的特殊变体,特别是在基因组和生物学中具有应用,其中字符串对应于DNA或蛋白质序列,并且其中的相似性应被检测或定量。我们首先提出一个更传统的A *搜索,该搜索使用高级的上限计算来求解部分解决方案。这种精确的方法适用于具有两个输入字符串的实例,并且如实验所示,其性能优于文献中的其他几种精确方法。但是,由于问题的复杂性,当应考虑使用更多的字符串时,A *搜索也具有自然的局限性。为了在实践中有效地处理这种情况,我们会随时研究A *搜索变体,这些变体几乎可以在任何时间返回合理的启发式解决方案,并且有望找到越来越好的解决方案,直到有足够的时间达到公认的最佳解决方案。特别是提出了一种新颖的方法,其中随时列搜索(ACS)与传统的A *节点扩展交错。 ACS迭代由一个新的启发式函数指导,该函数近似于子问题中LCPS的预期长度,通常比可用的上限计算要好得多。这种A * + ACS混合体能够解决中小型LCPS实例达到公认的最优性,同时返回良好的启发式解决方案以及大型实例的上限。在严格的实验评估中,我们将A * + ACS与其他任何时候的A *搜索变体进行了比较,并观察了其优越性。 (C)2019 Elsevier Ltd.保留所有权利。



  • 外文文献
  • 中文文献
  • 专利
  • 写作辅导
  • 期刊发表


京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号