Software Quality Journal

Does choice of mutation tool matter?

Abstract

Though mutation analysis is the primary means of evaluating the quality of test suites, it suffers from inadequate standardization. Mutation analysis tools vary by language, by when mutants are generated (phase of compilation), and by target audience. Mutation tools rarely implement the complete set of operators proposed in the literature, and most implement at least a few domain-specific mutation operators. Thus, different tools may not always agree on the mutant kills of a test suite. Few criteria exist to guide a practitioner in choosing the right tool, either for evaluating the effectiveness of a test suite or for comparing different testing techniques. We investigate an ensemble of measures for evaluating the efficacy of mutants produced by different tools. These include the traditional difficulty of detection, strength of minimal sets, and the diversity of mutants, as well as the information carried by the mutants produced. We find that mutation tools rarely agree. The disagreement between scores can be large, and the variation due to characteristics of the project, even after accounting for differences due to test suites, is a significant factor. However, the mean difference between tools is very small, indicating that no single tool consistently skews mutation scores high or low for all projects. These results suggest that experiments yielding small differences in mutation score, especially those using a single tool or a small number of projects, may not be reliable. There is a clear need for greater standardization of mutation analysis. We propose one approach for such a standardization.
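
The distinction the abstract draws between large per-project disagreement and a small mean difference can be illustrated with a minimal sketch. The tool names, project names, and kill counts below are invented purely for illustration and are not data from the paper; the mutation score is assumed to be the usual ratio of killed mutants to generated mutants.

```python
from statistics import mean


def mutation_score(killed: int, total: int) -> float:
    """Fraction of generated mutants that the test suite kills."""
    if total <= 0:
        raise ValueError("total number of mutants must be positive")
    return killed / total


# Hypothetical (killed, total) mutant counts per project for two
# hypothetical tools. These numbers are made up, not taken from the paper.
tool_a = {"proj1": (80, 100), "proj2": (30, 100), "proj3": (55, 100)}
tool_b = {"proj1": (60, 100), "proj2": (50, 100), "proj3": (56, 100)}

# Signed per-project difference in mutation score between the two tools.
diffs = [
    mutation_score(*tool_a[p]) - mutation_score(*tool_b[p])
    for p in tool_a
]

# Signed differences cancel out across projects, so the mean is near zero...
mean_diff = mean(diffs)
# ...while the mean absolute difference shows how far the tools disagree
# on a typical project.
mean_abs_diff = mean(abs(d) for d in diffs)

print(f"mean signed difference:   {mean_diff:+.3f}")
print(f"mean absolute difference: {mean_abs_diff:.3f}")
```

In this toy setup neither tool is systematically higher, yet a comparison run on any single project could report a difference of up to 20 percentage points, which is the kind of unreliability the abstract warns about.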