SemRegex: A Semantics-Based Approach for Generating Regular Expressions from Natural Language Specifications


Abstract

Recent research proposes syntax-based approaches to address the problem of generating programs from natural language specifications. These approaches typically train a sequence-to-sequence learning model using a syntax-based objective: maximum likelihood estimation (MLE). Such syntax-based approaches do not effectively address the goal of generating semantically correct programs, because these approaches fail to handle Program Aliasing, i.e., semantically equivalent programs may have many syntactically different forms. To address this issue, in this paper, we propose a semantics-based approach named SemRegex. SemRegex provides solutions for a subtask of the program-synthesis problem: generating regular expressions from natural language. Different from the existing syntax-based approaches, SemRegex trains the model by maximizing the expected semantic correctness of the generated regular expressions. The semantic correctness is measured using the DFA-equivalence oracle, random test cases, and distinguishing test cases. The experiments on three public datasets demonstrate the superiority of SemRegex over the existing state-of-the-art approaches.
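
To make Program Aliasing and test-based semantic checks concrete, the following Python sketch compares two regular expressions by their behavior on test strings rather than by their syntax. It is only an illustration under stated assumptions, not the SemRegex implementation: the helper names (`random_strings`, `find_distinguishing`, `semantically_agree`) are hypothetical, Python's `re` syntax stands in for whatever regex dialect the datasets use, and agreement on a finite test set only approximates the exact equivalence that a DFA-equivalence oracle would decide.

```python
import random
import re


def random_strings(alphabet, max_len, count, seed=0):
    """Sample `count` random strings over `alphabet` (hypothetical helper)."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))
        for _ in range(count)
    ]


def find_distinguishing(regex_a, regex_b, tests):
    """Return the first test string the two regexes disagree on, else None."""
    pat_a, pat_b = re.compile(regex_a), re.compile(regex_b)
    for s in tests:
        # fullmatch checks membership of the whole string in the language.
        if (pat_a.fullmatch(s) is None) != (pat_b.fullmatch(s) is None):
            return s
    return None


def semantically_agree(regex_a, regex_b, tests):
    """True if no test string separates the two regexes (an approximation)."""
    return find_distinguishing(regex_a, regex_b, tests) is None


# Random test cases over a small assumed alphabet.
tests = random_strings("ab", max_len=6, count=500)

# Program aliasing: different syntax, same accepted language.
aliased = semantically_agree(r"(a|b)*", r"[ab]*", tests)

# A distinguishing test case: the empty string separates a* from a+.
witness = find_distinguishing(r"a*", r"a+", [""] + tests)
```

In this sketch the first pair differs textually but no test string separates it, while the empty string distinguishes `a*` from `a+`. A purely syntactic comparison would treat the first pair as a mismatch even though both expressions denote the same language.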


Bibliographic details

Source: Conference on Empirical Methods in Natural Language Processing | 2018 | pp. 1608-1618 | 11 pages
Authors: Zexuan Zhong; Jiaqi Guo; Wei Yang; Jian Peng; Tao Xie; Jian-Guang Lou; Ting Liu; Dongmei Zhang
Original format: PDF

Similar literature

1. An algorithmic approach for checking closure properties of temporal logic specifications and omega-regular languages [J]. Peled D., Wolper P., Wilke T. Theoretical Computer Science. 1998, No. 2
2. Specification of fault-tolerant system issues by predicate/transition nets and regular expressions-approach and case study [J]. Belli F., Grosspietsch K.-E. IEEE Transactions on Software Engineering. 1991, No. 6
3. Generating Natural Language specifications from UML class diagrams [J]. Farid Meziane, Nikos Athanasakis, Sophia Ananiadou. Requirements Engineering. 2008, No. 1
4. SemRegex: A Semantics-Based Approach for Generating Regular Expressions from Natural Language Specifications [C]. Zexuan Zhong, Jiaqi Guo, Wei Yang, et al. Conference on Empirical Methods in Natural Language Processing. 2018
5. Semantics-Based Approach for Generating Partial Views from Linked Life-Cycle Highway Project Data [D]. Le, Tuyen Thanh. 2017
6. Controlled Vocabularies Indexing and Medical Language Processing. Medical Language Processing: Database Capture of Natural Language Echocardiographic Reports: A Unified Medical Language System Approach [O]. K. Canfield, B. Bray, S. Huff, 1989
7. Using Semantic Unification to Generate Regular Expressions from Natural Language [O]. Kushman Nate, Barzilay Regina. 2013
