A Novel Greedy Algorithm for the Minimum Common String Partition Problem

机译：最小公共字符串分配问题的新型贪婪算法

获取原文

获取原文并翻译 | 示例

页面导航

摘要
著录项
相似文献
相关主题

摘要

The Minimum Common String Partition problem (MCSP) is to partition two given input strings into the same collection of substrings, where the number of substrings in the partition is minimized. This problem is a key problem in genome rearrangement, and is closely related to the problem of sorting by reversals with duplicates. MCSP is NP-hard, even for the most trivial case, 2-MCSP, where each letter occurs at most twice in each input string. There are various approximation algorithms which can achieve very good approximation ratios but with complicated implementations, for example, 1.5-approximation algorithm for 2-MCSP, 1.1037-approximation algorithm for 2-MCSP and a 4-approximation algorithm for 3-MCSP. There is also a simple greedy algorithm for MCSP which extracts the longest common substring from the given strings at each step. In this paper, we propose a novel greedy algorithm for MCSP, where we extract the longest common substring containing a symbol occurring only once at each step whenever there is a such symbol. We show our algorithm is more "worst case" greedy at each step than the greedy algorithm and the expected performance of our algorithm is better than that of the greedy algorithm. Our experiments show that our method achieves a better partition on average than the greedy algorithm does. Another advantage of our algorithm is that it is much faster than the greedy algorithm.

机译：最小公用字符串分区问题（MCSP）是将两个给定的输入字符串划分为相同的子字符串集合，其中将分区中的子字符串数量最小化。这个问题是基因组重排的关键问题，并且与通过重复重复进行分选的问题密切相关。 MCSP是NP硬式的，即使在最普通的情况下，也是如此2-MCSP，其中每个字母在每个输入字符串中最多出现两次。有多种近似算法可以实现非常好的近似比率，但是实现复杂，例如，针对2-MCSP的1.5近似算法，针对2-MCSP的1.1037近似算法和针对3-MCSP的4-近似算法。对于MCSP，还有一个简单的贪婪算法，该算法在每个步骤中从给定的字符串中提取最长的公共子字符串。在本文中，我们提出了一种新颖的针对MCSP的贪婪算法，该算法提取了最长的公共子字符串，该子字符串包含一个只要有一个符号就在每个步骤仅出现一次的符号。我们显示，与贪婪算法相比，我们的算法在每个步骤上的贪婪程度都更高，并且该算法的预期性能要优于贪婪算法。我们的实验表明，与贪婪算法相比，我们的方法平均可以获得更好的分区。我们算法的另一个优点是它比贪婪算法要快得多。

著录项

来源
《Bioinformatics Research and Applications; Lecture Notes in Bioinformatics; 4463》|2007年|441-452|共12页
会议地点 AtlantaGA(US)
作者
Dan He;
展开▼
作者单位

Department of Computer Science, University of Vermont Burlington, VT 05405, USA;

展开▼
会议组织
原文格式 PDF
正文语种 eng
中图分类生物工程学（生物技术）;
关键词

相似文献

外文文献
中文文献
专利

1. Randomized greedy multi-start algorithm for the minimum common integer partition problem [J] . Manuel Lozano, Francisco J. Rodriguez, Daniel Peralta, Engineering Applications of Artificial Intelligence . 2016,第Apra期

机译：最小公整数分区问题的随机贪心多启动算法
2. Computational performance evaluation of two integer linear programming models for the minimum common string partition problem [J] . Blum Christian, Raidl Gunther R. Optimization Letters . 2016,第1期

机译：最小公共字符串分配问题的两个整数线性规划模型的计算性能评估
3. Mathematical programming strategies for solving the minimum common string partition problem [J] . Blum Christian, Lozano Jose A., Davidson Pinacho European Journal of Operational Research . 2015,第3期

机译：解决最小公共字符串划分问题的数学编程策略
4. The Greedy Algorithm for the Minimum Common String Partition Problem [C] . Marek Chrobak, Petr Kolman, Jiri Sgall Approximation, Randomization, and Combinatorial Optimization: Algorithms and Techniques . 2004

机译：最小公共字符串分配问题的贪婪算法
5. Analysis of data partitioning on correlated data to genetic sequence searches using string matching algorithms. [D] . Nance, David R. 2006

机译：使用字符串匹配算法对相关数据的数据分区进行遗传序列搜索的分析。
6. An Integer Programming Formulation of the Minimum Common String Partition Problem [O] . S. M. Ferdous, M. Sohel Rahman -1

机译：最小公共字符串分配问题的整数编程公式
7. The Greedy Algorithm for the Minimum Common String Partition Problem [O] . Marek Chrobak, Petr Kolman, Jiri Sgall 2004

机译：最小公共字符串分区问题的贪心算法
8. O(LOG N) Time Common CRCW Pram Algorithm for Minimum Spanning Tree. [R] . Wu, M. M. 1990

机译：用于最小生成树的O（LOG N）时间公共CRCW pram算法。

A Novel Greedy Algorithm for the Minimum Common String Partition Problem

摘要

著录项

相似文献

相关主题

期刊订阅