Will this localization tool be effective for this bug? Mitigating the impact of unreliability of information retrieval based bug localization tools

Le Tien-Duy B.; Thung Ferdian; Lo David

Abstract

Information retrieval (IR) based bug localization approaches process a textual bug report and a collection of source code files to find buggy files. They output a ranked list of files sorted by their likelihood of containing the bug. Recently, several IR-based bug localization tools have been proposed. However, no tool can successfully localize faults within a small number of the most suspicious program elements for every input bug report. Therefore, it is difficult for developers to decide which tool would be effective for a given bug report. Furthermore, for some bug reports, no bug localization tool would be useful. Even a state-of-the-art bug localization tool outputs many ranked lists where buggy files appear very low in the lists. This potentially causes developers to distrust bug localization tools. In this work, we build an oracle that can automatically predict whether a ranked list produced by an IR-based bug localization tool is likely to be effective or not. We consider a ranked list to be effective if a buggy file appears in the top-N positions of the list. If a ranked list is unlikely to be effective, developers do not need to waste time checking the recommended files one by one. In such cases, it is better for developers to use traditional debugging methods or request further information to localize bugs. To build this oracle, our approach extracts features that can be divided into four categories: score features, textual features, topic model features, and metadata features. We build a separate prediction model for each category, and combine them to create a composite prediction model which is used as the oracle. We name this solution APRILE, which stands for Automated PRediction of IR-based Bug Localization's Effectiveness. We further integrate APRILE with two other components that are learned using our bagging-based ensemble classification (BEC) method. We refer to the extension of APRILE as APRILE+.
We have evaluated APRILE+ by predicting the effectiveness of three state-of-the-art IR-based bug localization tools on more than three thousand bug reports from AspectJ, Eclipse, SWT, and Tomcat. APRILE+ achieves an average precision, recall, and F-measure of 77.61%, 88.94%, and 82.09%, respectively. Furthermore, APRILE+ outperforms a baseline approach by Le and Lo and the original APRILE, with F-measure increases of up to 17.43% and 10.51%, respectively.
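
The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the default cutoff n=10, and the use of plain averaging to combine the four per-category scores are all assumptions made for illustration; the abstract does not say how the category models are combined or which value of N is used.

```python
from typing import Dict, List, Sequence, Set, Tuple


def is_effective(ranked_files: Sequence[str], buggy_files: Set[str], n: int = 10) -> bool:
    """Label a ranked list as effective if any buggy file is in its top-n positions.

    The cutoff n=10 is an illustrative default, not a value taken from the abstract.
    """
    return any(f in buggy_files for f in ranked_files[:n])


def composite_score(category_scores: Dict[str, float]) -> float:
    """Combine per-category confidence scores into one score.

    Plain averaging is an assumption; the abstract only says the models are combined.
    """
    return sum(category_scores.values()) / len(category_scores)


def precision_recall_f1(predicted: List[bool], actual: List[bool]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F-measure for the 'effective' class."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical ranked list and buggy file set for one bug report.
    ranked = ["A.java", "B.java", "C.java"]
    print(is_effective(ranked, {"C.java"}, n=2))
    # Hypothetical confidence scores from the four feature categories.
    scores = {"score": 0.9, "textual": 0.7, "topic": 0.5, "metadata": 0.3}
    print(composite_score(scores) >= 0.5)
```

In this sketch a developer would inspect the ranked list only when the composite score clears a chosen threshold (0.5 here, also an assumption), and fall back to traditional debugging otherwise.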

Bibliographic details

Source
Empirical Software Engineering | 2017, Issue 4 | pp. 2237-2279 | 43 pages
Authors
Le Tien-Duy B.; Thung Ferdian; Lo David
Affiliations

Singapore Management Univ, Sch Informat Syst, Singapore, Singapore;

Singapore Management Univ, Sch Informat Syst, Singapore, Singapore;

Singapore Management Univ, Sch Informat Syst, Singapore, Singapore;

Indexing information
Original format: PDF
Language: English
Keywords
Text classification; Information retrieval; Bug reports; Bug localization; Effectiveness prediction;

Similar literature

1. Are datasets for information retrieval-based bug localization techniques trustworthy? Impact analysis of bug types on IRBL [J]. Kim Misoo, Lee Eunseok. Empirical Software Engineering, 2021, Issue 3.
2. On the relationship between bug reports and queries for text retrieval-based bug localization [J]. Chris Mills, Esteban Parra, Jevgenija Pantiuchina, Empirical Software Engineering, 2020, Issue 5.
3. Using bug descriptions to reformulate queries during text-retrieval-based bug localization [J]. Chaparro Oscar, Florez Juan Manuel, Marcus Andrian. Empirical Software Engineering, 2019, Issue 5.
4. Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports [C]. Zhou Jian, Zhang Hongyu, Lo David. 34th International Conference on Software Engineering (ICSE), 2012.
5. The accuracy of information retrieval based bug localization techniques [D]. Beard, Matthew D. 2011.
6. Barcoding Bugs: DNA-Based Identification of the True Bugs (Insecta: Hemiptera: Heteroptera) [O]. Doo-Sang Park, Robert Foottit, Eric Maw, 2011.
7. Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports [O]. Jian Zhou, Hongyu Zhang, David Lo. 2012.
