Venue: IEEE International Conference on Software Analysis, Evolution and Reengineering

Syntax and sensibility: Using language models to detect and correct syntax errors

Abstract

Syntax errors are made by novice and experienced programmers alike; however, novice programmers lack the years of experience that would help them quickly resolve these frustrating errors. Standard LR parsers are of little help, typically resolving syntax errors and their precise location poorly. We propose a methodology that locates where syntax errors occur and suggests possible changes to the token stream that can fix the identified error. This methodology finds syntax errors by using language models trained on correct source code to find tokens that seem out of place. Fixes are synthesized by consulting the language models to determine which tokens are more likely at the estimated error location. We compare n-gram and LSTM (long short-term memory) language models for this task, each trained on a large corpus of Java code collected from GitHub. Unlike prior work, our methodology does not assume that the problem source code comes from the same domain as the training data. We evaluated against a repository of real student mistakes. Our tools are able to find a syntactically valid fix within their top-2 suggestions, often producing the exact fix that the student used to resolve the error. The results show that this tool and methodology can locate and suggest corrections for syntax errors. Our methodology is of practical use to all programmers, but will be especially useful to novices frustrated with incomprehensible syntax errors.
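
The idea described in the abstract (score tokens with a language model trained on correct code, flag the least likely position, then rank candidate token edits at that position) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy bigram model with add-one smoothing in place of the paper's n-gram and LSTM models, a three-line hand-written corpus in place of the GitHub Java corpus, and it does not check candidate fixes with a parser. All function names (train_bigrams, locate_error, suggest_fixes) are hypothetical.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"  # sentence boundary markers (an assumption of this sketch)

def train_bigrams(corpus):
    """Count contexts and bigrams over token sequences of syntactically correct code."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for tokens in corpus:
        padded = [START] + tokens + [END]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    return contexts, bigrams, vocab

def log_prob(model, prev, cur):
    """Add-one smoothed log P(cur | prev)."""
    contexts, bigrams, vocab = model
    return math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab)))

def transition_scores(model, tokens):
    """Log-probability of every adjacent token pair, including the boundaries."""
    padded = [START] + tokens + [END]
    return [log_prob(model, p, c) for p, c in zip(padded, padded[1:])]

def locate_error(model, tokens):
    """Return the index of the most surprising token.

    An index equal to len(tokens) means the end of the input is the
    surprising point, i.e. something is probably missing there.
    """
    scores = transition_scores(model, tokens)
    return min(range(len(scores)), key=scores.__getitem__)

def suggest_fixes(model, tokens, pos, top_k=2):
    """Rank single-token deletions, insertions, and substitutions at pos."""
    _, _, vocab = model
    candidates = sorted(vocab - {START, END})  # sorted for deterministic output
    edits = []
    if pos < len(tokens):
        edits.append(("delete", tokens[pos], tokens[:pos] + tokens[pos + 1:]))
    for tok in candidates:
        edits.append(("insert", tok, tokens[:pos] + [tok] + tokens[pos:]))
        if pos < len(tokens):
            edits.append(("substitute", tok, tokens[:pos] + [tok] + tokens[pos + 1:]))

    def mean_score(edit):
        # Average per transition so edits of different lengths stay comparable.
        scores = transition_scores(model, edit[2])
        return sum(scores) / len(scores)

    edits.sort(key=mean_score, reverse=True)
    return edits[:top_k]

if __name__ == "__main__":
    corpus = [
        ["int", "x", "=", "1", ";"],
        ["int", "y", "=", "2", ";"],
        ["x", "=", "y", ";"],
    ]
    model = train_bigrams(corpus)
    broken = ["int", "x", "=", "1"]  # missing semicolon
    pos = locate_error(model, broken)
    print(pos)  # 4, the end of the input
    for kind, token, fixed in suggest_fixes(model, broken, pos):
        print(kind, token, fixed)  # the first suggestion inserts ";"
```

On this toy input the least likely transition is the one from `1` to the end of the sequence, and the top-ranked edit inserts `;` there. A fuller version along the lines of the abstract would tokenize real Java source, use a stronger model, and keep only candidates that a parser accepts.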
