Journal: Empirical Software Engineering

Will this localization tool be effective for this bug? Mitigating the impact of unreliability of information retrieval based bug localization tools

Abstract

Information retrieval (IR) based bug localization approaches process a textual bug report and a collection of source code files to find buggy files. They output a ranked list of files sorted by their likelihood of containing the bug. Recently, several IR-based bug localization tools have been proposed. However, no tool can successfully localize faults within a small number of the most suspicious program elements for every input bug report. Therefore, it is difficult for developers to decide which tool would be effective for a given bug report. Furthermore, for some bug reports, no bug localization tool would be useful. Even a state-of-the-art bug localization tool outputs many ranked lists in which buggy files appear very low. This potentially causes developers to distrust bug localization tools. In this work, we build an oracle that can automatically predict whether a ranked list produced by an IR-based bug localization tool is likely to be effective or not. We consider a ranked list to be effective if a buggy file appears in the top-N positions of the list. If a ranked list is unlikely to be effective, developers do not need to waste time checking the recommended files one by one. In such cases, it is better for developers to use traditional debugging methods or to request further information to localize bugs. To build this oracle, our approach extracts features that can be divided into four categories: score features, textual features, topic model features, and metadata features. We build a separate prediction model for each category, and combine them to create a composite prediction model which is used as the oracle. We name this solution APRILE, which stands for Automated PRediction of IR-based Bug Localization's Effectiveness. We further integrate APRILE with two other components that are learned using our bagging-based ensemble classification (BEC) method. We refer to the extension of APRILE as APRILE (+).
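The effectiveness criterion stated above (a buggy file appears in the top-N positions of the ranked list) can be sketched as a small labeling function. This is only an illustration: the function name `is_effective`, the file paths, and the values of `n` are hypothetical, and the abstract does not say which N the authors use.

```python
from typing import Iterable, Sequence


def is_effective(ranked_files: Sequence[str],
                 buggy_files: Iterable[str],
                 n: int) -> bool:
    """Return True if at least one known buggy file is in the top-n positions.

    ranked_files is the tool's output, most suspicious file first.
    n is the cutoff; its value is an assumption of this sketch.
    """
    top_n = set(ranked_files[:n])
    return any(path in top_n for path in buggy_files)


# Hypothetical ranked list for one bug report, most suspicious file first.
ranked = ["ui/Button.java", "core/Parser.java", "io/Reader.java"]
buggy = {"core/Parser.java"}

label_top1 = is_effective(ranked, buggy, n=1)  # buggy file is at rank 2
label_top2 = is_effective(ranked, buggy, n=2)
```

Labels computed this way for past bug reports would be the ground truth that an oracle like the one described is trained to predict.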
We have evaluated APRILE (+) on predicting the effectiveness of three state-of-the-art IR-based bug localization tools on more than three thousand bug reports from AspectJ, Eclipse, SWT, and Tomcat. APRILE (+) achieves an average precision, recall, and F-measure of 77.61%, 88.94%, and 82.09%, respectively. Furthermore, APRILE (+) outperforms a baseline approach by Le and Lo and the original APRILE, improving F-measure by up to 17.43% and 10.51%, respectively.
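For readers unfamiliar with the reported metrics, the following sketch computes precision, recall, and F-measure for the "effective" class from actual and predicted labels. The label lists are invented for illustration and are not data from the paper; the abstract also does not state how its averages are taken, so this shows the per-experiment computation only.

```python
from typing import Sequence, Tuple


def precision_recall_f(actual: Sequence[bool],
                       predicted: Sequence[bool]) -> Tuple[float, float, float]:
    """Precision, recall, and F-measure for the positive ("effective") class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Invented labels: True means the ranked list is (predicted to be) effective.
actual = [True, True, True, False, False]
predicted = [True, True, False, True, False]

precision, recall, f_measure = precision_recall_f(actual, predicted)
```

Here there are two true positives, one false positive, and one false negative, so all three metrics equal 2/3.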
