Conference: IEEE/ACM International Conference on Automated Software Engineering

Combining Deep Learning with Information Retrieval to Localize Buggy Files for Bug Reports


Abstract

Bug localization refers to the automated process of locating the potentially buggy files for a given bug report. Helping developers focus their attention on those files is crucial. Several existing automated approaches to bug localization from a bug report face a key challenge, called lexical mismatch, in which the terms used in bug reports to describe a bug differ from the terms and code tokens used in source files. This paper presents a novel approach that uses a deep neural network (DNN) in combination with rVSM, an information retrieval (IR) technique. rVSM captures the textual similarity between bug reports and source files as a feature. The DNN learns to relate the terms in bug reports to potentially different code tokens and terms in source files and documentation, provided they appear together frequently enough in pairs of reports and buggy files. Our empirical evaluation on real-world projects shows that DNN and IR complement each other well and achieve higher bug localization accuracy than the individual models. Importantly, our new model, HyLoc, which combines the features built from DNN, rVSM, and the project's bug-fixing history, achieves higher accuracy than the state-of-the-art IR and machine learning techniques. In half of the cases, it is correct with just a single suggested file. In two out of three cases, a correct buggy file is in the list of three suggested files.
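To make the ranking idea in the abstract concrete, here is a minimal sketch, not the HyLoc implementation. It assumes a plain TF-IDF cosine similarity as a stand-in for rVSM, a normalized past-fix count as a stand-in for the bug-fixing history feature, and a fixed weight `alpha` in place of the learned DNN combination. All function names, file paths, and sample data are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers, lowercase, and split on non-alphanumerics."""
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t for t in re.split(r"[^A-Za-z0-9]+", text.lower()) if t]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_files(report, files, fix_counts, alpha=0.8):
    """Rank source files for one bug report.

    files: dict mapping path -> source text.
    fix_counts: dict mapping path -> number of past bug fixes (assumed feature).
    alpha: assumed fixed weight; the paper learns the combination instead.
    """
    paths = list(files)
    vecs = tfidf_vectors([tokenize(report)] + [tokenize(files[p]) for p in paths])
    max_fix = max(fix_counts.values(), default=0) or 1
    scores = {}
    for path, vec in zip(paths, vecs[1:]):
        text_sim = cosine(vecs[0], vec)
        history = fix_counts.get(path, 0) / max_fix
        scores[path] = alpha * text_sim + (1 - alpha) * history
    return sorted(paths, key=lambda p: scores[p], reverse=True)


def top_k_hit(ranked, buggy, k):
    """True if any actually buggy file is among the first k suggestions."""
    return any(path in buggy for path in ranked[:k])


# Hypothetical toy project and bug report.
files = {
    "ui/LoginDialog.java": "class LoginDialog { void showPasswordField() {} }",
    "net/HttpClient.java": "class HttpClient { void sendRequest() {} }",
    "util/StringUtils.java": "class StringUtils { String trim(String s) { return s; } }",
}
fix_counts = {"ui/LoginDialog.java": 3, "net/HttpClient.java": 1}
report = "Password field is not shown in the login dialog"

ranked = rank_files(report, files, fix_counts)
print(ranked)
print(top_k_hit(ranked, {"ui/LoginDialog.java"}, k=1))
```

The `top_k_hit` helper corresponds to the accuracy figures quoted in the abstract (a correct file within the first one or three suggestions). A purely lexical score like this one is exactly what suffers from lexical mismatch when a report shares no tokens with the buggy file, which is the gap the learned DNN features are meant to close.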
