International Journal of Artificial Intelligence Tools: Architectures, Languages, Algorithms

Performance-Estimation Properties of Cross-Validation-Based Protocols with Simultaneous Hyper-Parameter Optimization

Abstract

A typical supervised data analysis involves two tasks: (a) model selection, that is, selecting an optimal combination of learning methods (e.g., for variable selection and for classification) and tuning their hyper-parameters (e.g., K in K-NN), and (b) estimating the performance of the final, reported model. Combining the two tasks is not trivial: when one selects the hyper-parameter configuration that appears to give the best estimated performance, that estimate is optimistic (biased/overfitted) because it is the winner of multiple statistical comparisons. In this paper, we discuss the theoretical properties of performance estimation when model selection is present, and we confirm that simple Cross-Validation with model selection is indeed optimistic (it overestimates performance) in small-sample scenarios and should be avoided. We present in detail and investigate the theoretical properties of Nested Cross-Validation and of a method by Tibshirani and Tibshirani for removing the estimation bias. In computational experiments with real datasets, both protocols provide conservative estimates of performance and should be preferred. These statements hold true even if feature selection is performed as preprocessing.
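
The difference between the two protocols named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 1-D K-NN classifier, the synthetic data, the candidate values of K, and every function name below are assumptions made for illustration only. The labels are pure noise, so the true accuracy of any model is 0.5. The plain protocol reports the cross-validated score of the winning K, while the nested protocol repeats the selection inside each outer training set and scores the selected model on the held-out outer fold.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (1-D features)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def make_folds(data, n_folds):
    """Split data into n_folds disjoint folds by striding."""
    return [data[i::n_folds] for i in range(n_folds)]

def cv_accuracy(data, k, n_folds):
    """Plain cross-validated accuracy of K-NN for one fixed k."""
    folds = make_folds(data, n_folds)
    correct = 0
    for i, test in enumerate(folds):
        train = [p for j, f in enumerate(folds) if j != i for p in f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)

def select_k(data, candidates, n_folds):
    """Model selection: return the best k and its own CV accuracy."""
    scores = {k: cv_accuracy(data, k, n_folds) for k in candidates}
    best = max(candidates, key=lambda k: scores[k])
    return best, scores[best]

def nested_cv_accuracy(data, candidates, n_outer, n_inner):
    """Nested CV: select k on each outer training set only,
    then score that choice on the untouched outer test fold."""
    folds = make_folds(data, n_outer)
    correct = 0
    for i, test in enumerate(folds):
        train = [p for j, f in enumerate(folds) if j != i for p in f]
        best_k, _ = select_k(train, candidates, n_inner)
        correct += sum(knn_predict(train, x, best_k) == y for x, y in test)
    return correct / len(data)

# Hypothetical toy data: random features, random labels (no signal).
random.seed(0)
data = [(random.random(), random.randint(0, 1)) for _ in range(40)]
candidates = [1, 3, 5, 7, 9]

# Plain protocol: the reported score is the maximum over all candidates.
best_k, plain_estimate = select_k(data, candidates, n_folds=5)
# Nested protocol: the selection step never sees the outer test fold.
nested_estimate = nested_cv_accuracy(data, candidates, n_outer=5, n_inner=4)

print(f"selected k = {best_k}")
print(f"plain CV estimate (after selection) = {plain_estimate:.3f}")
print(f"nested CV estimate = {nested_estimate:.3f}")
```

Because the plain estimate is by construction the maximum over the candidate scores, it can only match or exceed the score of any single candidate, which is the source of the optimism described above. On a single small dataset the nested estimate is noisy and is not guaranteed to fall below the plain one; the claim concerns the behaviour on average over datasets.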
