IEEE/ACM International Conference on Automated Software Engineering

Cats Are Not Fish: Deep Learning Testing Calls for Out-Of-Distribution Awareness

Abstract

As Deep Learning (DL) is increasingly adopted in industrial applications, its quality and reliability have started to raise concerns. As in the traditional software development process, testing DL software to uncover defects at an early stage is an effective way to reduce risks after deployment. By the fundamental assumption of deep learning, DL software provides no statistical guarantees and has limited capability in handling data that falls outside its learned distribution, i.e., out-of-distribution (OOD) data. Although recent progress has been made in designing novel testing techniques for DL software, which can detect thousands of errors, current state-of-the-art DL testing techniques usually do not take the distribution of the generated test data into consideration. It is therefore hard to judge whether the "identified errors" are meaningful errors for the DL application (i.e., due to quality issues of the model) or outliers that the current model cannot handle (i.e., due to a lack of training data). To fill this gap, we take the first step and conduct a large-scale empirical study, with a total of 451 experiment configurations, 42 deep neural networks (DNNs), and 1.2 million test data instances, to investigate and characterize the impact of OOD-awareness on DL testing. We further analyze the consequences when DL systems go into production by evaluating the effectiveness of adversarial retraining with distribution-aware errors. The results confirm that introducing data distribution awareness in both the testing and enhancement phases outperforms distribution-unaware retraining by up to 21.5%.
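The distinction the abstract draws, between errors on in-distribution inputs and errors on OOD inputs, can be illustrated with a minimal sketch. The abstract does not say which OOD detector the study uses, so everything here is an assumption for illustration: the helper names (`fit_feature_stats`, `ood_score`, `split_errors`), the root-mean-square z-score used as the OOD score, and the threshold of 3.0 are all hypothetical and are not the authors' method.

```python
import math
from statistics import mean, pstdev


def fit_feature_stats(train_features):
    """Return per-dimension mean and standard deviation of the training features."""
    dims = list(zip(*train_features))
    means = [mean(d) for d in dims]
    # Fall back to 1.0 when a dimension has zero variance, to avoid dividing by zero.
    stds = [pstdev(d) or 1.0 for d in dims]
    return means, stds


def ood_score(x, means, stds):
    """Root-mean-square z-score of x; larger values lie farther from the training data."""
    squared = [((xi - m) / s) ** 2 for xi, m, s in zip(x, means, stds)]
    return math.sqrt(sum(squared) / len(squared))


def split_errors(error_inputs, means, stds, threshold=3.0):
    """Partition error-inducing inputs into (in_distribution, out_of_distribution)."""
    in_dist, out_dist = [], []
    for x in error_inputs:
        # Assumed rule: a score above the threshold marks the input as OOD.
        if ood_score(x, means, stds) > threshold:
            out_dist.append(x)
        else:
            in_dist.append(x)
    return in_dist, out_dist


# Toy training features and two inputs on which a hypothetical model failed.
train = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
errors = [[0.6, 0.4], [10.0, 10.0]]

means, stds = fit_feature_stats(train)
in_dist_errors, ood_errors = split_errors(errors, means, stds)
```

Under these assumptions, `in_dist_errors` would be the candidates that point to model quality issues, while `ood_errors` would be set aside as inputs the model was never trained to handle.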