Conference: International Conference on Web Information Systems Engineering

Learning Restricted Deterministic Regular Expressions with Counting



Abstract

Regular expressions are widely used in various fields, and learning regular expressions from sequence data remains an active research topic. Since many XML documents are not accompanied by a schema, or by a valid schema, learning regular expressions from XML documents is an essential task. In this paper, we propose a restricted subclass of single-occurrence regular expressions with counting (RCsores) and give a learning algorithm for RCsores. First, we learn a single-occurrence regular expression (SORE). Then, we construct an equivalent countable finite automaton (CFA). Next, the CFA runs on the given finite sample to obtain an updated CFA, which contains the counting operators occurring in an RCsore. Finally, we transform the updated CFA into an RCsore. Moreover, our algorithm ensures that the result is a minimal generalization of the given finite sample (such a generalization is called descriptive).
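
The idea of reading counting bounds off a finite sample can be illustrated with a much simplified sketch. It is not the CFA-based algorithm of the paper: it only handles runs of a single repeated symbol and assumes each symbol occurs at one position in the expression. The function names `run_length_bounds` and `render_counter` are hypothetical and chosen for illustration only.

```python
from itertools import groupby


def run_length_bounds(sample):
    """Map each symbol to the (min, max) length of its maximal consecutive runs.

    Simplifying assumption (not from the paper): counting is inferred only
    for runs of one repeated symbol, observed directly in the sample words.
    """
    bounds = {}
    for word in sample:
        for symbol, run in groupby(word):
            n = sum(1 for _ in run)  # length of this maximal run
            lo, hi = bounds.get(symbol, (n, n))
            bounds[symbol] = (min(lo, n), max(hi, n))
    return bounds


def render_counter(symbol, lo, hi):
    """Render a symbol with a counting operator such as a{2,3}."""
    if lo == hi == 1:
        return symbol
    if lo == hi:
        return f"{symbol}{{{lo}}}"
    return f"{symbol}{{{lo},{hi}}}"


sample = ["aab", "aaab", "aabb"]
bounds = run_length_bounds(sample)
expression = "".join(render_counter(s, lo, hi) for s, (lo, hi) in bounds.items())
print(expression)
```

For this sample the tightest bounds consistent with every word are 2 to 3 repetitions of `a` and 1 to 2 repetitions of `b`. Choosing the tightest bounds mirrors, in a very loose sense, the goal of a minimal generalization stated in the abstract.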
