首页> 外文期刊>IEEE/ACM Transactions on Networking >A De-Compositional Approach to Regular Expression Matching for Network Security
【24h】

A De-Compositional Approach to Regular Expression Matching for Network Security

机译:网络安全正则表达式匹配的分解方法

获取原文
获取原文并翻译 | 示例

摘要

Regular Expression (RegEx) matching is the industry standard for Deep Packet Inspection (DPI) because RegExes are significantly more expressive than strings. To achieve high matching speed, we need to convert the RegExes to Deterministic Finite State Automata (DFA). However, DFA has the state explosion problem, that is, the number of DFA states and transitions can be exponential with the number of RegExes. Much work has addressed the DFA state explosion problem; however, none has met all the requirements of fast and automated construction, small memory image, and high matching speed. In this paper, we propose a decompositional approach, with fast and automated construction, small memory image, and high matching speed, to DFA state explosion. The first key idea is to decompose a complex RegEx that cause exponential state increases into a set of simpler RegExes that do not cause exponential state increases, where any character string that matches the complex RegEx also matches all the RegExes in the set of simpler RegExes; that is, the set of strings that match the complex RegEx is a subset of strings that match the set of simpler RegExes. The second key idea is to use a stateful post-processing engine to filter the matches that are actually the matches of the complex RegEx. Given an input string for matching, instead of using the large DFA constructed from the original complex RegEx to perform the matching, we first use the small DFA constructed from the set of simpler RegExes to perform the matching, and then, if the small DFA reports a match, we use the post-processing engine to determine whether it is a true match to the original complex RegEx. Because the pre-processing is simple, automaton construction can be automated and fast, and because most on-line processing is done by a DFA, its matching speed is close to that of a DFA alone. Our experimental results show that our decompositional approach achieves orders of magnitude faster DFA construction (in terms of seconds instead of minutes), 30 times smaller memory image, and 43 faster matching speeds, than state-of-the-art software based RegEx matching algorithms.
机译:正则表达式(RegEx)匹配是深度数据包检查(DPI)的行业标准,因为RegExes比字符串具有更高的表现力。为了实现高匹配速度,我们需要将RegExes转换为确定性有限状态自动机(DFA)。但是,DFA存在状态爆炸问题,即DFA状态和转换的数量与RegExes的数量成指数关系。许多工作已经解决了DFA状态爆炸问题;但是,没有一个能够满足快速和自动构造,小内存映像和高匹配速度的所有要求。在本文中,我们提出了一种分解方法,该方法具有快速且自动化的构造,较小的内存映像以及较高的匹配速度,可以解决DFA状态爆炸问题。第一个关键思想是将导致指数状态增加的复杂RegEx分解为一组不会引起指数状态增加的简单RegEx,其中,与复杂RegEx匹配的任何字符串也将与这组简单RegExes中的所有RegEx匹配;也就是说,与复杂RegEx匹配的字符串集是与简单RegExes匹配的字符串的子集。第二个关键思想是使用有状态的后处理引擎来过滤匹配项,这些匹配项实际上是复杂RegEx的匹配项。给定用于匹配的输入字符串,我们首先使用从较简单的RegExes集合中构造的小DFA进行匹配,而不是使用从原始复杂RegEx构造的大DFA进行匹配,然后,如果小DFA报告匹配,我们使用后处理引擎来确定它是否与原始的复杂RegEx真正匹配。因为预处理很简单,所以自动机的构建可以自动化且快速,并且由于大多数在线处理都是由DFA完成的,因此其匹配速度接近于单独的DFA。我们的实验结果表明,与基于最新软件的RegEx匹配算法相比,我们的分解方法可将DFA的构建速度提高数个数量级(以秒为单位,而不是分钟),将存储映像缩小30倍,并且匹配速度提高43倍。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号