首页> 外文会议>International conference on database systems for advanced applications >An Adaptive Approach of Approximate Substring Matching
【24h】

An Adaptive Approach of Approximate Substring Matching

机译:近似子串匹配的自适应方法

获取原文

摘要

Approximate substring matching is a common problem in many applications. In this paper, we study approximate substring matching with edit distance constraints. Existing methods are very sensitive to query strings or query parameters like query length and edit distance. To address the problem, we propose a new approach using partition scheme. It first partitions a query into several segments, and finds matching substrings of these segments as candidates, then performs a bidirectional verification on these candidates to get final results. We devise an even partition scheme to efficiently find candidates, and a best partition scheme to find high quality candidates. Furthermore, through theoretical analysis, we find that the best partition scheme cannot always outperform the even partition scheme. Thus we propose an adaptive approach for selectively choosing scheme using statistic knowledge. We conduct comprehensive experiments to demonstrate the efficiency and quality of our proposed method.
机译:子字符串的近似匹配是许多应用程序中的常见问题。在本文中,我们研究具有编辑距离约束的近似子串匹配。现有方法对查询字符串或查询参数(如查询长度和编辑距离)非常敏感。为了解决这个问题,我们提出了一种使用分区方案的新方法。它首先将查询分为几个段,并找到这些段的匹配子串作为候选,然后对这些候选执行双向验证以获取最终结果。我们设计了一个均匀的分区方案来有效地找到候选者,并设计一个最佳的分区方案来找到高质量的候选者。此外,通过理论分析,我们发现最佳的分区方案不能总是优于偶数分区方案。因此,我们提出了一种自适应方法,用于使用统计知识有选择地选择方案。我们进行了全面的实验,以证明我们提出的方法的效率和质量。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号