An empirical study of bugs in test code

Abstract

Testing aims at detecting (regression) bugs in production code. However, testing code is just as likely to contain bugs as the code it tests. Buggy test cases can silently miss bugs in the production code or loudly ring false alarms when the production code is correct. We present the first empirical study of bugs in test code to characterize their prevalence and root cause categories. We mine the bug repositories and version control systems of 211 Apache Software Foundation (ASF) projects and find 5,556 test-related bug reports. We (1) compare properties of test bugs with production bugs, such as active time and fixing effort needed, and (2) qualitatively study 443 randomly sampled test bug reports in detail and categorize them based on their impact and root causes. Our results show that (1) around half of all the projects had bugs in their test code; (2) the majority of test bugs are false alarms, i.e., test fails while the production code is correct, while a minority of these bugs result in silent horrors, i.e., test passes while the production code is incorrect; (3) incorrect and missing assertions are the dominant root cause of silent horror bugs; (4) semantic (25%), flaky (21%), environment-related (18%) bugs are the dominant root cause categories of false alarms; (5) the majority of false alarm bugs happen in the exercise portion of the tests, and (6) developers contribute more actively to fixing test bugs and test bugs are fixed sooner compared to production bugs. In addition, we evaluate whether existing bug detection tools can detect bugs in test code.
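The two impact categories named in the abstract can be illustrated with a minimal sketch. Everything in it is hypothetical and invented for illustration: the functions `apply_discount` and `absolute`, the test names, and the `passes` helper do not come from the paper or from any ASF project. It shows a test that passes over incorrect production code because it lacks an assertion (a silent horror), and a test that fails over correct production code because its expected value is wrong (a false alarm).

```python
def apply_discount(price, percent):
    # Hypothetical production code with a deliberate bug:
    # it adds the discount instead of subtracting it.
    return price + price * percent / 100


def absolute(x):
    # Hypothetical production code that is correct.
    return x if x >= 0 else -x


def silent_horror_test():
    # Test bug (missing assertion): the code is exercised but the result
    # is never checked, so the test passes although the code is wrong.
    apply_discount(100, 10)


def fixed_test():
    # Same scenario with the assertion in place: the production bug is caught.
    assert apply_discount(100, 10) == 90


def false_alarm_test():
    # Test bug (wrong expected value): the test fails although
    # the production code is correct.
    assert absolute(-3) == -3


def passes(test):
    # Tiny stand-in for a test runner: True if no assertion fails.
    try:
        test()
        return True
    except AssertionError:
        return False


results = {
    t.__name__: passes(t)
    for t in (silent_horror_test, fixed_test, false_alarm_test)
}
print(results)
```

In this sketch only the silent horror test reports success, which is exactly why such bugs are hard to notice: nothing fails until someone inspects the test itself.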

Bibliographic record

Source
International Conference on Software Maintenance and Evolution | 2015 | pp. 101-110 | 10 pages
Authors
Vahabzadeh Arash; Fard Amin Milani; Mesbah Ali;
Original format: PDF
Keywords
configuration management; program debugging; program testing; ASF projects; Apache Software Foundation projects; bug detection tools; bug repositories; false alarm bugs; production bugs; production code; randomly sampled test bug reports; root cause categories; silent horror bugs; test code bugs; test-related bug reports; version control systems; Computer bugs; Control systems; Data collection; Data mining; Production; Software; Testing; Bugs; empirical study; test code;

Similar literature

1. Comparing Software Bugs in Clone and Non-clone Code: An Empirical Study [J]. Judith F. Islam, Manishankar Mondal, Chanchal K. Roy. International Journal of Software Engineering and Knowledge Engineering, 2017, issue 9a10.
2. An empirical study on bug propagation through code cloning [J]. Mondal Manishankar, Roy Banani, Roy Chanchal K. The Journal of Systems and Software, 2019, issue Deca.
3. Towards understanding bugs in an open source cloud management stack: An empirical study of OpenStack software bugs [J]. Zheng Wei, Feng Chen, Yu Tingting. The Journal of Systems and Software, 2019, May issue.
4. Code coverage and test suite effectiveness: Empirical study with real bugs in large systems [C]. Kochhar Pavneet Singh, Thung Ferdian, Lo David. International Conference on Software Analysis, Evolution, and Reengineering, 2015.
5. Empirical Studies of Performance Bugs and Performance Analysis Approaches for Software Systems [D]. Zaman, Shahed. 2012.
6. The KA/KS Ratio Test for Assessing the Protein-Coding Potential of Genomic Regions: An Empirical and Simulation Study [O]. Anton Nekrutenko, Kateryna D. Makova, Wen-Hsiung Li. 2002.
7. An Empirical Validation of the Complexity of Code Changes and Bugs in Predicting the Release Time of Open Source Software [O]. Chaturvedi K. K., Bedi Punam, Misra Sanjay. 2013.
