Verified Uncertainty Calibration

Abstract

Applications such as weather forecasting and personalized medicine demand models that output calibrated probability estimates--those representative of the true likelihood of a prediction. Most models are not calibrated out of the box but are recalibrated by post-processing model outputs. We find in this work that popular recalibration methods like Platt scaling and temperature scaling (i) are less calibrated than reported, and (ii) have a calibration error that current techniques cannot estimate. An alternative method, histogram binning, has measurable calibration error but is sample inefficient--it requires O(B/ε^2) samples, compared to O(1/ε^2) for scaling methods, where B is the number of distinct probabilities the model can output. To get the best of both worlds, we introduce the scaling-binning calibrator, which first fits a parametric function to reduce variance and then bins the function values to actually ensure calibration. This requires only O(1/ε^2 + B) samples. Next, we show that we can estimate a model's calibration error more accurately using an estimator from the meteorological community--or equivalently measure its calibration error with fewer samples (O(√B) instead of O(B)). We validate our approach with multiclass calibration experiments on CIFAR-10 and ImageNet, where we obtain a 35% lower calibration error than histogram binning and, unlike scaling methods, guarantees on true calibration. We implement all these methods in a Python library: https://pypi.org/project/uncertainty-calibration
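
The scaling-then-binning idea in the abstract can be illustrated with a minimal NumPy sketch for the binary case. This is not the code of the library linked above. The function names (`fit_scaling`, `fit_scaling_binning`), the three-way data split, the equal-mass bins, and the plain gradient-descent fit are assumptions made for illustration only.

```python
import numpy as np


def _logit(p, eps=1e-12):
    # Clip to avoid infinities at exactly 0 or 1.
    p = np.clip(p, eps, 1 - eps)
    return np.log(p) - np.log1p(-p)


def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_scaling(probs, labels, steps=2000, lr=0.1):
    """Platt-style scaling: fit g(p) = sigmoid(a * logit(p) + b)
    by gradient descent on the log loss (illustrative, not tuned)."""
    z = _logit(probs)
    a, b = 1.0, 0.0
    for _ in range(steps):
        residual = _sigmoid(a * z + b) - labels
        a -= lr * np.mean(residual * z)
        b -= lr * np.mean(residual)
    return a, b


def fit_scaling_binning(probs, labels, num_bins=10):
    """Hypothetical helper: fit a scaling function, then discretize it.

    Assumed procedure: split the data into three parts, fit the scaling
    function on part 1, choose equal-mass bin edges from the scaled
    values of part 2, and set each bin's output to the mean scaled
    value of the part-3 points that fall in that bin.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    part1, part2, part3 = np.array_split(np.arange(len(probs)), 3)

    a, b = fit_scaling(probs[part1], labels[part1])

    def g(p):
        return _sigmoid(a * _logit(p) + b)

    # Interior bin edges: quantiles of the scaled values on part 2.
    quantiles = np.linspace(0, 1, num_bins + 1)[1:-1]
    edges = np.quantile(g(probs[part2]), quantiles)

    # Per-bin output: average scaled value (not average label) on part 3.
    g3 = g(probs[part3])
    idx3 = np.searchsorted(edges, g3)
    bin_values = np.array([
        g3[idx3 == k].mean() if np.any(idx3 == k) else g3.mean()
        for k in range(num_bins)
    ])

    def calibrate(p):
        p = np.asarray(p, dtype=float)
        return bin_values[np.searchsorted(edges, g(p))]

    return calibrate
```

The returned `calibrate` function outputs at most `num_bins` distinct probabilities. That discreteness is what lets the calibration error be measured by binning, while the parametric fit in the first step is what keeps the variance low.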

Bibliographic record

Source: Conference on Neural Information Processing Systems | 2020 | pp. 3179-3968 | 12 pages
Authors: Ananya Kumar; Percy Liang; Tengyu Ma
Original format: PDF
Chinese Library Classification: Metrology

Similar documents

1. Uncertainty sources analysis of a calibration system for the accuracy vs. temperature verification of voltage transformers [J]. Alessandro Mingotti, Lorenzo Peretto, Roberto Tinarelli. Journal of Physics: Conference Series. 2018, No. 5.
2. Mitigating Error and Uncertainty in Partitioned Analysis: A Review of Verification, Calibration and Validation Methods for Coupled Simulations [J]. Stevens Garrison, Atamturktur Sez. Archives of Computational Methods in Engineering. 2017, No. 3.
3. Integration of model verification, validation, and calibration for uncertainty quantification in engineering systems [J]. Sankararaman Shankar, Mahadevan Sankaran. Reliability Engineering & System Safety. 2015, No. juna.
4. Verified Uncertainty Calibration [C]. Ananya Kumar, Percy Liang, Tengyu Ma. Conference on Neural Information Processing Systems. 2020.
5. Uncertainty of Stereo PIV Calibration and Self-Calibration [D]. Williams, Braydon J. 2017.
6. Dielectric Properties of Glass Beads with Talc as a Reference Material for Calibration and Verification of Dielectric Methods and Devices for Measuring Soil Moisture [O]. Justyna Szerement, Hironobu Saito, Kahori Furuhata. 2020.
7. Evaluation of the measurement uncertainty at calibration of devices for electrical energy and power meters verification [O]. Андрей Николаевич Попенака, Александр Иванович Колбасин, Наталья Михайловна Маслова. 2017.
