Conference: International Colloquium on Theoretical Aspects of Computing

Optimally Streaming Greedy Regular Expression Parsing

Abstract

We study the problem of streaming regular expression parsing: given a regular expression and an input stream of symbols, how to output a serialized syntax tree representation as an output stream during input stream processing. We show that optimally streaming regular expression parsing, outputting bits of the output as early as is semantically possible for any regular expression of size m and any input string of length n, can be performed in time O(2^(m log m) + mn) on a unit-cost random-access machine. This is for the widespread greedy disambiguation strategy for choosing parse trees of grammatically ambiguous regular expressions. In particular, for a fixed regular expression, the algorithm's run time scales linearly with the input string length. The exponential is due to the need for preprocessing the regular expression to analyze state coverage of its associated NFA, a PSPACE-hard problem, and tabulating all reachable ordered sets of NFA states. Previous regular expression parsing algorithms operate in multiple phases, always requiring processing or storing the whole input string before outputting the first bit of output, not only for those regular expressions and input prefixes where reading to the end of the input is strictly necessary.
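To make "greedy disambiguation" and "serialized syntax tree" concrete, here is a minimal Python sketch of a non-streaming baseline. It is not the algorithm from the paper. It is a plain backtracking parser that needs the whole input and can take exponential time. The tuple-based regular expression encoding, the function names `parses` and `greedy_parse`, and the bit-coding convention (0/1 for the left/right alternative, 0 for "one more star iteration", 1 for "stop") are all assumptions chosen for illustration.

```python
# Regular expressions as nested tuples (hypothetical encoding for this sketch):
#   ("eps",)  ("chr", c)  ("seq", e1, e2)  ("alt", e1, e2)  ("star", e)

def parses(e, s, i):
    """Yield (end, bits) for every way e matches s[i:end], in greedy priority order."""
    tag = e[0]
    if tag == "eps":
        yield i, ""
    elif tag == "chr":
        if i < len(s) and s[i] == e[1]:
            yield i + 1, ""
    elif tag == "seq":
        for j, b1 in parses(e[1], s, i):
            for k, b2 in parses(e[2], s, j):
                yield k, b1 + b2
    elif tag == "alt":
        # Greedy: the left alternative has priority over the right one.
        for j, b in parses(e[1], s, i):
            yield j, "0" + b
        for j, b in parses(e[2], s, i):
            yield j, "1" + b
    elif tag == "star":
        # Greedy: prefer another iteration over stopping.
        for j, b1 in parses(e[1], s, i):
            if j > i:  # skip empty iterations so the recursion terminates
                for k, b2 in parses(e, s, j):
                    yield k, "0" + b1 + b2
        yield i, "1"

def greedy_parse(e, s):
    """Return the bit code of the first (highest-priority) parse of all of s, or None."""
    for end, bits in parses(e, s, 0):
        if end == len(s):
            return bits
    return None

# (a|ab)* is ambiguous in how it splits some inputs into iterations.
E = ("star", ("alt", ("chr", "a"), ("seq", ("chr", "a"), ("chr", "b"))))
print(greedy_parse(E, "aab"))  # 00011
print(greedy_parse(E, "ab"))   # 011
```

Under this convention the code for "aab" reads: iterate, left alternative (a), iterate, right alternative (ab), stop. The codes for "ab" and "aab" agree on the first bit but differ on the second, so after reading only the first "a" a streaming parser could commit to at most one output bit. Deciding exactly how many bits are safe to emit after each input symbol is the kind of question the optimality notion in the abstract is about.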
