Conference: IEEE International Conference on Software Quality, Reliability, and Security

Manifesting Bugs in Machine Learning Code: An Explorative Study with Mutation Testing


Abstract

Nowadays, statistical machine learning is widely adopted in various domains such as data mining, image recognition, and automated driving. However, software quality assurance for machine learning is still in its infancy. While recent efforts have gone into improving the quality of training data and trained models, this paper focuses on code-level bugs in the implementations of machine learning algorithms. In this explorative study, we simulated program bugs by mutating Weka implementations of several classification algorithms. We observed that 8%-40% of the logically non-equivalent executable mutants were statistically indistinguishable from their golden versions. Moreover, another 15%-36% of the mutants were stubborn, as they did not perform significantly worse than a reference classifier on at least one natural data set. We also experimented with several approaches to killing those stubborn mutants. Preliminary results indicate that bugs in machine learning code may have negative impacts on statistical properties such as robustness and learning curves, but they could be very difficult to detect due to the lack of effective oracles.
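The three outcomes described in the abstract (indistinguishable, stubborn, killed) can be illustrated with a minimal Python sketch. The abstract does not say which statistical test, significance level, or reference classifier the authors used, so everything here is an assumption made for illustration: the exact paired sign-flip permutation test, the threshold `alpha=0.05`, the per-fold accuracy inputs, and the hypothetical function names `paired_permutation_p` and `classify_mutant`.

```python
from itertools import product
from statistics import mean


def paired_permutation_p(a, b, one_sided=False):
    """Exact sign-flip permutation test on paired scores (assumed test).

    Two-sided: p-value for "a and b differ on average".
    One-sided: p-value for "a is worse (lower) than b on average".
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = mean(diffs)
    hits = 0
    total = 0
    # Enumerate every way of flipping the sign of each paired difference.
    for signs in product((1, -1), repeat=len(diffs)):
        m = mean([s * d for s, d in zip(signs, diffs)])
        total += 1
        if one_sided:
            hits += m <= observed + 1e-12
        else:
            hits += abs(m) >= abs(observed) - 1e-12
    return hits / total


def classify_mutant(golden, mutant, reference, alpha=0.05):
    """Label a mutant from per-fold accuracies, keyed by data set name.

    golden, mutant, reference: dict mapping data set -> list of accuracies.
    """
    # Indistinguishable: no data set separates the mutant from the golden version.
    if all(paired_permutation_p(mutant[d], golden[d]) >= alpha for d in golden):
        return "indistinguishable"
    # Stubborn: on at least one data set the mutant is not significantly
    # worse than the (hypothetical) reference classifier.
    if any(
        paired_permutation_p(mutant[d], reference[d], one_sided=True) >= alpha
        for d in golden
    ):
        return "stubborn"
    return "killed"


if __name__ == "__main__":
    golden = {"ds": [0.90, 0.91, 0.89, 0.92, 0.90, 0.91]}
    reference = {"ds": [0.60, 0.61, 0.59, 0.62, 0.60, 0.61]}
    mutant = {"ds": [0.85, 0.86, 0.84, 0.87, 0.85, 0.86]}
    print(classify_mutant(golden, mutant, reference))
```

In this sketch a mutant that is consistently a little worse than the golden version, yet still well above the reference, is labelled stubborn. That matches the abstract's point that such bugs degrade the classifier without producing an obviously wrong result. With only a handful of folds the exact test has coarse p-values, so at least six paired scores are needed before a two-sided result can fall below 0.05.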
