The reliability of estimated confidence intervals for classification error rates when only a single sample is available

Hanczar, B.; Dougherty, E.R.

Abstract

Error estimation accuracy is the salient issue regarding the validity of a classifier model. When samples are small, error estimates based on the training data tend to be inaccurate, and quantifying their accuracy is difficult. Numerous methods have been proposed for estimating confidence intervals for the true error based on the estimated error. This paper surveys the proposed methods and quantifies their performance. Monte Carlo methods are used to obtain accurate estimates of the true confidence intervals, which are then compared with the intervals estimated from samples. We consider different error estimators and several proposed confidence-bound estimators. Both synthetic and real genomic data are employed. Our simulations show that the majority of the confidence-interval methods perform poorly because the shapes of the true and estimated intervals differ. According to our results, the best estimation strategy is to use 10-times-repeated 10-fold cross-validation with a confidence interval based on the standard deviation.
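The strategy named in the last sentence of the abstract can be illustrated with a minimal sketch. The abstract does not give the exact interval formula, so the code below assumes a normal-approximation interval (mean error ± z × sample standard deviation of the per-fold error rates); that formula, the nearest-centroid classifier, the synthetic Gaussian data, and all function names are illustrative assumptions, not the authors' implementation.

```python
import random
import statistics


def make_data(n, dim, shift, rng):
    """Synthetic two-class Gaussian sample (assumed setup, not from the paper).

    Class 1 is shifted by `shift` in every feature; labels alternate 0, 1, 0, 1...
    """
    data = []
    for i in range(n):
        label = i % 2
        x = [rng.gauss(label * shift, 1.0) for _ in range(dim)]
        data.append((x, label))
    return data


def fit_centroids(train):
    """Nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label of the centroid closest to x (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def repeated_cv_fold_errors(data, k=10, repeats=10, seed=0):
    """Run `repeats` rounds of k-fold cross-validation.

    Returns one error rate per held-out fold, i.e. repeats * k values.
    """
    rng = random.Random(seed)
    fold_errors = []
    for _ in range(repeats):
        idx = list(range(len(data)))
        rng.shuffle(idx)  # a fresh random partition for every repetition
        folds = [idx[i::k] for i in range(k)]
        for fold in folds:
            held_out = set(fold)
            train = [data[i] for i in idx if i not in held_out]
            centroids = fit_centroids(train)
            wrong = sum(predict(centroids, data[i][0]) != data[i][1] for i in fold)
            fold_errors.append(wrong / len(fold))
    return fold_errors


def std_confidence_interval(errors, z=1.96):
    """Assumed interval: mean +/- z * sample standard deviation, clipped to [0, 1]."""
    mean = statistics.mean(errors)
    sd = statistics.stdev(errors)
    return mean, max(0.0, mean - z * sd), min(1.0, mean + z * sd)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = make_data(n=50, dim=5, shift=1.0, rng=rng)  # a single small sample
    errors = repeated_cv_fold_errors(sample, k=10, repeats=10)
    mean, low, high = std_confidence_interval(errors)
    print(f"estimated error {mean:.3f}, interval [{low:.3f}, {high:.3f}]")
```

Whether the standard deviation should be taken over folds, over repetitions, or over some other quantity is a design choice that this sketch fixes arbitrarily; the paper itself should be consulted for the definition it evaluates.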


Bibliographic record

Source: Pattern Recognition: The Journal of the Pattern Recognition Society, 2013, No. 3, 11 pages
Authors: Hanczar, B.; Dougherty, E.R.
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: Confidence interval; Error estimation; High dimension; Small sample setting; Supervised learning

Similar literature

1. The reliability of estimated confidence intervals for classification error rates when only a single sample is available [J]. Hanczar, B.; Dougherty, E.R. Pattern Recognition: The Journal of the Pattern Recognition Society. 2013, No. 3
2. Estimating confidence interval of software reliability with adaptive testing strategy [J]. Junpeng Lv, Bei-Bei Yin, Kai-Yuan Cai. The Journal of Systems and Software. 2014, No. nova
3. Conditional confidence intervals for classification error rate [J]. Chung HC, Han CP. Computational statistics & data analysis. 2009, No. 12
4. Estimating completion rates from small samples using binomial confidence intervals: Comparisons and recommendations [C]. Jeff Sauro, James R. Lewis. Human Factors and Ergonomics Society annual meeting. 2005
5. Comparing the overlapping of two independent confidence intervals with a single confidence interval for two normal population parameters [D]. Huang, Ching-Ying. 2008
6. Standard errors and confidence intervals for variable importance in random forest regression classification and survival [O]. Hemant Ishwaran, Min Lu -1
7. Estimating completion rates from small samples using binomial confidence intervals: Comparisons and recommendations [O]. Jeff Sauro, James R. Lewis. 2005
