首页> 中文期刊>计算机应用 >基于通配符和长度约束的近似模式匹配算法

基于通配符和长度约束的近似模式匹配算法

     

摘要

Current works on the Approximate Pattern Matching with Wildcards and Length constraints (APMWL) problem can only cope with replacement operation. This paper proposed an Edit Distance Matrix (EDM) method based on dynamic programming and the Approximate Pattern Matching with EDM (APM) algorithm. APM can handle all approximate operations including insertion, replacement and deletion. Moreover, this paper extended APM to the APM-OF algorithm with a strict constraint condition that each character can be used at most once for pattern matching in a sequence. The experiments verify that both APM and APM-OF have significant advantages on matching solutions against other peers. The average improvement rates of matching compared to SAIL-Approx are up to 8. 34% and 12. 37% respectively. It also demonstrates an advantage on approximate pattern mining that the number of approximate patterns mined by APM-OF is 2. 07 times of that mined by OneoffMining.%针对近似模式匹配算法在处理带有灵活通配符和长度约束近似模式匹配(APMWL)问题时只能解决替换操作,提出一种基于动态规划的编辑距离矩阵(EDM)构造方法,设计了基于EDM的近似模式匹配算法APM,可以处理近似匹配中的三种编辑操作,即插入、替换和删除操作.此外,根据文本中字符是否允许被重复使用的约束条件,设计APM-OF算法.实验结果表明,APM和APM-OF与同类算法相比具备显著的优势:与Sail_Approx匹配算法实验对比,获取解的平均增长率分别达到8.34%和12.37%;将APM-OF算法应用至模式挖掘中,挖掘出的频繁近似模式个数为OneoffMining算法的2.07倍.

著录项

相似文献

  • 中文文献
  • 外文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号