【24h】

Approximating edit distance efficiently

机译:有效地近似编辑距离

获取原文

摘要

Edit distance has been extensively studied for the past several years. Nevertheless, no linear-time algorithm is known to compute the edit distance between two strings, or even to approximate it to within a modest factor. Furthermore, for various natural algorithmic problems such as low-distortion embeddings into normed spaces, approximate nearest-neighbor schemes, and sketching algorithms, known results for the edit distance are rather weak. We develop algorithms that solve gap versions of the edit distance problem: given two strings of length n with the promise that their edit distance is either at most k or greater than /spl lscr/, decide which of the two holds. We present two sketching algorithms for gap versions of edit distance. Our first algorithm solves the k vs. (kn)/sup 2/3/ gap problem, using a constant size sketch. A more involved algorithm solves the stronger k vs. /spl lscr/ gap problem, where /spl lscr/ can be as small as O(k/sup 2/) - still with a constant sketch - but works only for strings that are mildly "nonrepetitive". Finally, we develop an n/sup 3/7/-approximation quasilinear time algorithm for edit distance, improving the previous best factor of n/sup 3/4/ (Cole and Hariharan, 2002); if the input strings are assumed to be nonrepetitive, then the approximation factor can be strengthened to n/sup 1/3/.
机译:在过去的几年中,对编辑距离进行了广泛的研究。但是,尚无线性时间算法可以计算两个字符串之间的编辑距离,甚至无法将其近似到一个适度的因子之内。此外,对于各种自然算法问题,例如在规范空间中的低失真嵌入,近似最近邻方案和草绘算法,编辑距离的已知结果相当弱。我们开发了解决间隙距离的编辑距离问题的算法:给定两个长度为n的字符串,并保证它们的编辑距离至多为k或大于/ spl lscr /,确定这两个条件中的哪一个成立。我们为编辑距离的间隙版本提供了两种草图绘制算法。我们的第一个算法使用恒定大小的草图解决了k vs.(kn)/ sup 2/3 /间隙问题。涉及更多的算法可以解决更强的k vs./spl lscr /间隙问题,其中/ spl lscr /可以小至O(k / sup 2 /)-仍具有恒定的草图-但仅适用于中等程度的字符串“非重复”。最后,我们为编辑距离开发了n / sup 3/7 /-近似准线性时间算法,提高了先前的n / sup 3/4 /最佳因子(Cole和Hariharan,2002);如果假定输入字符串是非重复的,则可以将近似因子增强为n / sup 1/3 /。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号