...
首页> 外文期刊>Information Processing Letters >Fast practical multi-pattern matching
【24h】

Fast practical multi-pattern matching

机译:快速实用的多模式匹配

获取原文
获取原文并翻译 | 示例
   

获取外文期刊封面封底 >>

       

摘要

The multi-pattern matching problem consists finding all occurrences of the patterns from a finite set X in a given text T of length n. We present a new and simple algorithm combining the ideas of the Aho-Corasick algorithm and the directed acyclic word graphs. The algorithm has time complexity which is linear in the worst case (it makes at most 2n symbol comparisons) and has good average-case time complexity assuming the shortest pattern is sufficiently long. Denote the length of the shortest pattern by m, and the total length of all patterns by M. Assume that M is polynomial with respect to m, the alphabet contains at least 2 symbols and the text (in which the pattern is to be found) is random, for each position each letter occurs independently with the same probability. Then the average number of comparisons is O(n/m).log m), which matches the lower bound of the problem. For sufficiently large values of m the algorithm has a good behavior in practice.
机译:多样式匹配问题包括在长度为n的给定文本T中从有限集合X中找到样式的所有出现。我们提出了一种新的简单算法,结合了Aho-Corasick算法和有向无环词图的思想。该算法的时间复杂度在最坏的情况下是线性的(最多进行2n个符号比较),并且假设最短的模式足够长,则具有良好的平均情况时间复杂度。用m表示最短模式的长度,用M表示所有模式的总长度。假设M相对于m是多项式,则字母至少包含2个符号和文本(在其中可以找到模式)是随机的,对于每个位置,每个字母都以相同的概率独立出现。那么比较的平均次数为O(n / m).log m),它与问题的下限匹配。对于足够大的m值,该算法在实践中具有良好的性能。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号