

Operation is the hardest teacher: estimating DNN accuracy looking for mispredictions



Abstract

Deep Neural Networks (DNNs) are typically tested for accuracy using a set of unlabelled real-world data (the operational dataset), from which a subset is selected, manually labelled, and used as a test suite. This subset must be small (because of the cost of manual labelling) yet faithfully represent the operational context, so that the resulting test suite contains roughly the same proportion of examples causing misprediction (i.e., failing test cases) as the operational dataset. However, while testing to estimate accuracy, it is also desirable to learn as much as possible from the failing tests in the operational dataset, since they reveal possible bugs in the DNN. A smart sampling strategy may make it possible to deliberately include many misprediction-causing examples in the test suite, thereby providing more valuable inputs for DNN improvement while preserving the ability to obtain trustworthy, unbiased estimates. This paper presents a test selection technique (DeepEST) that actively looks for failing test cases in the operational dataset of a DNN, with the goal of assessing the DNN's expected accuracy with a small and “informative” test suite (namely, one with a high number of mispredictions) for subsequent DNN improvement. Experiments with five subjects, combining four DNN models and three datasets, are described. The results show that DeepEST provides DNN accuracy estimates with precision close to (and often better than) that of existing sampling-based DNN testing techniques, while detecting 5 to 30 times more mispredictions with the same test suite size.
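
The abstract does not spell out DeepEST's sampling scheme, so the following is only a minimal sketch of the general idea it describes: oversample inputs that look likely to be mispredicted, then reweight so the accuracy estimate stays unbiased. It uses a swapped-in textbook technique (with-replacement sampling proportional to an auxiliary score, with the Hansen-Hurwitz estimator), not the authors' method. The names `make_population`, `estimate_accuracy`, and the auxiliary score itself (for example, one minus the model's confidence) are hypothetical.

```python
import random


def make_population(size=2000):
    """Build a synthetic operational dataset (hypothetical, for illustration).

    Returns (aux_scores, failures): a positive auxiliary score per input and
    a ground-truth flag saying whether the DNN mispredicts that input.
    """
    aux_scores, failures = [], []
    for i in range(size):
        fails = i % 20 == 0  # 5% of inputs are mispredicted
        if fails:
            # Most failures get a high score; some are "hidden" with a low one.
            score = 0.1 if i % 100 == 0 else 0.8
        else:
            # Some correct predictions also look suspicious.
            score = 0.8 if i % 10 == 1 else 0.1
        aux_scores.append(score)
        failures.append(fails)
    return aux_scores, failures


def estimate_accuracy(aux_scores, is_failure, n, rng):
    """Estimate accuracy from n draws made with probability proportional
    to aux_scores (with replacement), using the Hansen-Hurwitz estimator.

    is_failure(i) stands in for manually labelling input i and checking
    the DNN's prediction. Returns (accuracy_estimate, distinct_failures_found).
    """
    size = len(aux_scores)
    total = sum(aux_scores)
    probs = [s / total for s in aux_scores]
    draws = rng.choices(range(size), weights=probs, k=n)
    # Each draw contributes y_i / p_i; their mean estimates the failure count.
    est_failures = sum((1.0 if is_failure(i) else 0.0) / probs[i] for i in draws) / n
    found = len({i for i in draws if is_failure(i)})
    return 1.0 - est_failures / size, found


if __name__ == "__main__":
    aux, fails = make_population()
    acc, found = estimate_accuracy(aux, lambda i: fails[i], 200, random.Random(0))
    print(f"estimated accuracy: {acc:.3f}, failing tests found: {found}")
```

Passing equal scores for every input reduces this to simple random sampling, which gives a baseline for comparing how many failing tests each strategy surfaces at the same test suite size.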


Bibliographic record

Source
IEEE/ACM International Conference on Software Engineering, 2021, pp. 348-358 (11 pages)
Authors
Antonio Guerriero; Roberto Pietrantuono; Stefano Russo;
Original format: PDF
Keywords
Adaptation models; Neural networks; Computer bugs; Software; Labeling; Testing; Software engineering;


