Journal: 《计算机学报》 (Chinese Journal of Computers)

A Matching Method Suited to Super-Large-Scale Pattern Sets

         

Abstract

String matching is a key technique in intrusion detection systems. As the number of patterns grows, existing automaton-based matching algorithms all run into excessive memory consumption, and once the pattern set reaches tens of thousands of patterns the automaton may not be constructible at all. This paper proposes SLSPM, a matching algorithm for the super-large-scale pattern matching (SLSPM) setting. SLSPM carries out matching with one block matching automaton (trie) and several ordinary character matching automata, and it supports pattern sets of at least tens of thousands of patterns. An ordinary matching automaton reads the current state first and then examines the input symbol. SLSPM instead first applies a hash function to the current text block to decide whether the block can be filtered out. If the block cannot be filtered and is a valid text block, the automaton then looks for the current state in the set of states that all recognize this same text block. Only when the current state is found there does the automaton read the next text block and continue matching. Theoretical analysis shows that SLSPM takes approximately O(n) time in the worst case, where n is the length of the text. Because SLSPM does not store all transition information, whereas advanced Aho-Corasick stores every possible transition, its matching speed is not substantially higher than that of advanced Aho-Corasick; the experimental results show it to be only slightly faster. The advantage of SLSPM is that, in a software environment, it is no slower than AC during matching, while at least tens of thousands of patterns can be loaded into its hybrid automata, which makes it suitable for super-large-scale pattern matching environments.
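The block-first lookup order described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the block size `BLOCK`, the names `build` and `search`, the use of a Python dict as the hash filter, and the direct comparison of pattern tails (standing in for the ordinary character automata) are all assumptions made for illustration. The sketch also restarts at every text position instead of using failure transitions, so it does not reproduce the approximately O(n) bound.

```python
from collections import defaultdict

BLOCK = 2  # assumed block size in characters (hypothetical value)


def build(patterns):
    """Build a block trie indexed by block first, then by source state."""
    # by_block[block] maps each source state that recognizes `block`
    # to the state reached after reading it.
    by_block = defaultdict(dict)
    # accept[state] lists (pattern, tail): the pattern ends after the blocks
    # leading to `state`, followed by a tail shorter than BLOCK.
    accept = defaultdict(list)
    next_state = 1  # state 0 is the root
    for p in patterns:
        state = 0
        full = len(p) - len(p) % BLOCK  # length covered by whole blocks
        for i in range(0, full, BLOCK):
            block = p[i:i + BLOCK]
            if state not in by_block[block]:
                by_block[block][state] = next_state
                next_state += 1
            state = by_block[block][state]
        accept[state].append((p, p[full:]))
    return by_block, accept


def search(text, by_block, accept):
    """Return (start, pattern) for every occurrence of a pattern in text."""
    hits = []
    for start in range(len(text)):
        state, pos = 0, start
        while True:
            # Tail check: a stand-in for the ordinary character automata.
            for pattern, tail in accept.get(state, ()):
                if text.startswith(tail, pos):
                    hits.append((start, pattern))
            block = text[pos:pos + BLOCK]
            # Step 1: hash lookup on the block itself; unknown blocks are
            # filtered out without consulting the current state.
            targets = by_block.get(block)
            # Step 2: check whether the current state is among the states
            # that recognize this block.
            if targets is None or state not in targets:
                break
            # Step 3: only now advance to the next text block.
            state = targets[state]
            pos += BLOCK
    return hits
```

Calling `build(["abcd", "abc", "cd", "x"])` and then `search("xabcdab", by_block, accept)` on the returned tables reports `x` at offset 0, `abc` and `abcd` at offset 1, and `cd` at offset 3. Indexing transitions by block rather than by state means a block that occurs in no pattern is rejected by a single hash lookup, which is the filtering step the abstract places ahead of the state check.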
