首页> 外文会议>IEEE International Parallel Distributed Processing Symposium >Hardware-Accelerated Regular Expression Matching with Overlap Handling on IBM PowerEN Processor
【24h】

Hardware-Accelerated Regular Expression Matching with Overlap Handling on IBM PowerEN Processor

机译:硬件加速的正则表达式匹配与IBM PowerEN处理器上的重叠处理

获取原文

摘要

Programmable hardware accelerators for regular expression (regex) matching are evolving into increasingly complex stream processors, which involve multiple state machines that operate in parallel, and specialized post-processors that can process instructions dispatched by the state machines. To improve the speed and the storage-efficiency, complex regexs are decomposed into simpler sub expressions, where each sub expression can fire one or more instructions. Although the impact of regex decompositions on the storage efficiency is well-known, little has been done to address the correctness and completeness. We show that regex decompositions can resultin false positives if overlaps between sub expressions are not taken into account. We describe formal methods to recognize various types of sub expression overlaps that can arise in regex decompositions. We also describe efficient post-processing techniques to eliminate the associated false positives. To enable efficient mapping of the decomposed regexs to the post-processors, we propose integer programming based register allocation methods. Our methods pack narrow variables to reduce the register and instruction usage, and take advantage of multi-register reset instructions to reduce the number of instructions that must be executed in parallel. Experiments on regex sets obtained from open-source and proprietary network intrusion detection systems demonstrate orders of magnitude improvement in the storage efficiency over state-of-the-art.
机译:用于正则表达式(regex)匹配的可编程硬件加速器正在演变为越来越复杂的流处理器,该流处理器涉及并行运行的多个状态机以及可以处理状态机调度的指令的专用后处理器。为了提高速度和存储效率,将复杂的正则表达式分解为更简单的子表达式,其中每个子表达式可以触发一个或多个指令。尽管正则表达式分解对存储效率的影响是众所周知的,但几乎没有做任何工作来解决正确性和完整性。我们表明,如果不考虑子表达式之间的重叠,则正则表达式分解会导致误报。我们描述了识别正规表达式分解中可能出现的各种类型的子表达式重叠的形式方法。我们还描述了有效的后处理技术,以消除相关的误报。为了使分解后的正则表达式有效映射到后处理器,我们提出了基于整数编程的寄存器分配方法。我们的方法打包了狭窄的变量以减少寄存器和指令的使用,并利用多寄存器复位指令来减少必须并行执行的指令数量。从开放源代码和专有网络入侵检测系统获得的正则表达式集的实验表明,与现有技术相比,存储效率提高了几个数量级。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号