Conference: Database systems for advanced applications

Filtering Techniques for Regular Expression Matching in Strings

Abstract

Matching a regular expression (regex) against a text is widely used in applications such as text editing, information extraction, and intrusion detection (IDS). Traditional algorithms compile an equivalent automaton from the regex query and run it over the text to find all matches, so their running time grows at least linearly with the size of the text. More recent algorithms use filtering techniques to jump quickly to candidate positions in the text where a match may occur, and only those candidate positions are verified by the automaton. In this paper, we give a full specification of filtering techniques for the regex matching problem, in which the filters for a regex query are classified into positive factors and negative factors. We review three typical positive factors (prefix, suffix, and necessary factor) and show that negative factors can work together with positive factors to significantly improve the filtering ability.
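The filter-then-verify idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it covers only the simplest positive factor, a literal prefix, and it uses Python's `re` module as a stand-in for the verifying automaton. The function name `filtered_spans` and the sample pattern and text are hypothetical, and the caller is assumed to supply a prefix that every match of the pattern really starts with.

```python
import re


def filtered_spans(pattern, prefix, text):
    """Return (start, end) spans of matches of `pattern` in `text`.

    Assumption (not checked here): every match of `pattern` begins with
    the literal string `prefix`, so only occurrences of `prefix` can be
    starting positions of a match.
    """
    verifier = re.compile(pattern)  # stand-in for the verifying automaton
    spans = []
    pos = text.find(prefix)  # filter step: jump to the next candidate
    while pos != -1:
        m = verifier.match(text, pos)  # verify step: anchored at the candidate
        if m:
            spans.append(m.span())
        pos = text.find(prefix, pos + 1)
    return spans


# Hypothetical example: "err" is a prefix of every match of err[0-9]+
sample_text = "xx err42 yy error zz err7"
sample_spans = filtered_spans(r"err[0-9]+", "err", sample_text)
print(sample_spans)
```

The occurrence of "err" inside "error" is a candidate that fails verification, which is the cost a filter tries to keep small. Because this sketch verifies at every occurrence of the prefix, it can report overlapping matches for some patterns, unlike `re.finditer`, which reports non-overlapping ones.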
