Pattern Recognition: The Journal of the Pattern Recognition Society

The reliability of estimated confidence intervals for classification error rates when only a single sample is available


Abstract

Error estimation accuracy is the salient issue regarding the validity of a classifier model. When samples are small, error estimates based on the training data tend to be inaccurate, and quantifying their accuracy is difficult. Numerous methods have been proposed for estimating confidence intervals for the true error based on the estimated error. This paper surveys the proposed methods and quantifies their performance. Monte Carlo methods are used to obtain accurate estimates of the true confidence intervals, which are then compared with the intervals estimated from samples. We consider different error estimators and several proposed confidence-bound estimators. Both synthetic and real genomic data are employed. Our simulations show that the majority of the confidence interval methods perform poorly because the true and estimated intervals differ in shape. According to our results, the best estimation strategy is to use 10 repetitions of 10-fold cross-validation with a confidence interval based on the standard deviation.
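The recommended strategy, repeated 10-fold cross-validation with an interval based on the standard deviation, can be illustrated with a minimal sketch. The abstract does not give the exact interval formula, so the code below assumes a normal-approximation interval (mean plus or minus z times the standard deviation of the 10 repetition-level error estimates, with z = 1.96). The synthetic Gaussian data, the nearest-centroid classifier, and all function names are hypothetical choices made only for illustration and are not taken from the paper.

```python
import random
import statistics


def make_data(n_per_class, dim, shift, rng):
    """Synthetic two-class sample: unit-variance Gaussians, class 1 shifted by `shift`."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(label * shift, 1.0) for _ in range(dim)]
            data.append((x, label))
    return data


def train_nearest_centroid(train):
    """Return the per-class mean vector (an illustrative classifier, not the paper's)."""
    centroids = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        centroids[label] = [sum(col) / len(pts) for col in zip(*pts)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def cv_error(data, k, rng):
    """One run of k-fold cross-validation; returns the fraction of points misclassified."""
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = 0
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in idx if i not in held_out]
        centroids = train_nearest_centroid(train)
        errors += sum(predict(centroids, data[i][0]) != data[i][1] for i in fold)
    return errors / len(data)


def repeated_cv_interval(data, repeats=10, k=10, z=1.96, seed=0):
    """Repeat k-fold CV and build mean +/- z * sd (assumed form), clipped to [0, 1]."""
    rng = random.Random(seed)
    estimates = [cv_error(data, k, rng) for _ in range(repeats)]
    mean = statistics.mean(estimates)
    sd = statistics.stdev(estimates)  # spread across the repetitions
    return mean, sd, (max(0.0, mean - z * sd), min(1.0, mean + z * sd))


if __name__ == "__main__":
    sample = make_data(n_per_class=20, dim=5, shift=0.8, rng=random.Random(42))
    mean, sd, (low, high) = repeated_cv_interval(sample)
    print(f"estimated error {mean:.3f}, sd {sd:.3f}, interval [{low:.3f}, {high:.3f}]")
```

In this sketch the standard deviation reflects only the variability introduced by re-partitioning the same single sample, so the resulting interval should be read as an illustration of the mechanics rather than as the calibrated procedure evaluated in the paper.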
