Manifesting Bugs in Machine Learning Code: An Explorative Study with Mutation Testing

Abstract

Nowadays, statistical machine learning is widely adopted in domains such as data mining, image recognition, and automated driving. However, software quality assurance for machine learning is still in its infancy. While recent efforts have gone into improving the quality of training data and trained models, this paper focuses on code-level bugs in the implementations of machine learning algorithms. In this explorative study, we simulated program bugs by mutating Weka implementations of several classification algorithms. We observed that 8%-40% of the logically non-equivalent executable mutants were statistically indistinguishable from their golden versions. Moreover, another 15%-36% of the mutants were stubborn: they did not perform significantly worse than a reference classifier on at least one natural data set. We also experimented with several approaches to killing those stubborn mutants. Preliminary results indicate that bugs in machine learning code may have negative impacts on statistical properties such as robustness and learning curves, but that they can be very difficult to detect because effective test oracles are lacking.
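
As a rough illustration of what "statistically indistinguishable from the golden version" can mean, the sketch below trains a toy classifier and a mutated copy on the same splits and compares their accuracies with a paired permutation test. Everything in it is an assumption made for illustration: the nearest-centroid classifier, the synthetic data, the injected mutation (dividing by count + 1), the 0.05 threshold, and all function names. It is not the study's Weka setup or its statistical procedure, and it covers only the golden-versus-mutant comparison, not the reference-classifier check used to identify stubborn mutants.

```python
import random


def make_data(rng, n=200):
    """Synthetic two-class 2-D data (an assumption; not a data set from the study)."""
    data = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are always present
        center = 0.0 if label == 0 else 2.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


def train_centroids(train, mutant=False):
    """Nearest-centroid training.

    The hypothetical mutant divides by (count + 1) instead of count,
    mimicking a small arithmetic mutation.
    """
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}
    counts = {0: 0, 1: 0}
    for (x, y), label in train:
        sums[label][0] += x
        sums[label][1] += y
        counts[label] += 1
    offset = 1 if mutant else 0
    return {
        c: (sums[c][0] / (counts[c] + offset), sums[c][1] / (counts[c] + offset))
        for c in (0, 1)
    }


def accuracy(test, centroids):
    """Fraction of test points whose nearest centroid matches their label."""
    correct = 0
    for (x, y), label in test:
        dist = {c: (x - cx) ** 2 + (y - cy) ** 2 for c, (cx, cy) in centroids.items()}
        predicted = min(dist, key=dist.get)
        correct += predicted == label
    return correct / len(test)


def paired_permutation_test(a, b, rng, rounds=2000):
    """Two-sided paired permutation test on the mean difference; returns a p-value."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs)) / len(diffs)
    hits = 0
    for _ in range(rounds):
        # Randomly flip the sign of each paired difference.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) / len(diffs) >= observed:
            hits += 1
    return (hits + 1) / (rounds + 1)


def run_experiment(n_runs=20):
    """Evaluate golden and mutant versions on the same splits across several seeds."""
    golden, mutated = [], []
    for seed in range(n_runs):
        rng = random.Random(seed)
        data = make_data(rng)
        train, test = data[:140], data[140:]
        golden.append(accuracy(test, train_centroids(train)))
        mutated.append(accuracy(test, train_centroids(train, mutant=True)))
    p_value = paired_permutation_test(golden, mutated, random.Random(0))
    return golden, mutated, p_value


if __name__ == "__main__":
    golden, mutated, p_value = run_experiment()
    print("mean golden accuracy:", sum(golden) / len(golden))
    print("mean mutant accuracy:", sum(mutated) / len(mutated))
    # The 0.05 threshold is an illustrative choice, not taken from the study.
    print("p-value:", p_value, "| killed:", p_value < 0.05)
```

A mutant whose p-value stays above the chosen threshold would survive this accuracy-based oracle even though its code differs logically from the golden version, which is the kind of hard-to-detect bug the abstract describes.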

Bibliographic record

Source
IEEE International Conference on Software Quality, Reliability, and Security | 2018 | 515p | 12 pages
Authors
Dawei Cheng; Chun Cao; Chang Xu; Xiaoxing Ma;
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
Keywords
Computer bugs; Testing; Machine learning; Training data; Prediction algorithms; Support vector machines; Machine learning algorithms;
