Journal: Computer science

DotStar: breaking the scalability and performance barriers in parsing regular expressions


Abstract

Regular expressions (regexp for short) are widely used to parse data and to detect recurrent patterns and information, and they are a common choice for defining configurable rules in a variety of systems. In fact, many data-intensive applications rely on regexp matching as the first line of defense to perform on-line data filtering. Unfortunately, few solutions can keep up with increasing data rates and with the complexity of sets containing hundreds of expressions. In this paper we present DotStar (.*), a complete algorithmic solution and software tool-chain that compiles large sets of regexp into an automaton able to take advantage of the vector/SIMD extensions available on many commodity multi-core processors. DotStar relies on several algorithmic innovations to transform the user-provided regexp set into a sequence of manageable intermediate representations. The resulting automaton is both space- and time-efficient, and can search in a single pass without backtracking. The experimental evaluation, performed on a family of state-of-the-art processors, shows that DotStar can efficiently handle both small sets of regexp, used in protocol parsing, and larger sets designed for Network Intrusion Detection Systems (NIDS), achieving a performance between 1 and 5 Gbit/sec per core.
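To make the idea of a single-pass, backtracking-free search over a set of patterns concrete, here is a minimal sketch. It is not DotStar's algorithm, which the abstract does not detail. It uses the classic Aho-Corasick construction as a simpler stand-in, and it handles only literal strings rather than full regular expressions. The function names `build_automaton` and `search` are hypothetical and chosen for illustration only.

```python
from collections import deque


def build_automaton(patterns):
    """Build an Aho-Corasick automaton for a set of literal patterns.

    Illustrative stand-in only: DotStar compiles full regexp sets,
    while this sketch supports plain strings.
    """
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # failure link of each state
    out = [set()]  # indices of the patterns recognised in each state

    # Phase 1: insert every pattern into a trie.
    for idx, pat in enumerate(patterns):
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(idx)

    # Phase 2: breadth-first traversal to compute failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target.
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def search(text, automaton, patterns):
    """Scan text once and return (start_index, pattern) for every match."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for pos, ch in enumerate(text):
        # Follow failure links instead of re-reading earlier input.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for idx in sorted(out[state]):
            matches.append((pos - len(patterns[idx]) + 1, patterns[idx]))
    return matches


if __name__ == "__main__":
    rules = ["he", "she", "his", "hers"]
    print(search("ushers", build_automaton(rules), rules))
```

Each input character is consumed exactly once. On a mismatch the automaton follows failure links between states rather than rewinding the input, which is the property the abstract refers to as searching without backtracking.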
