Journal: Neural Networks: The Official Journal of the International Neural Network Society

Classifier performance estimation under the constraint of a finite sample size: resampling schemes applied to neural network classifiers.

Abstract

In a practical classifier design problem the sample size is limited, and the available finite sample must be used both to design the classifier and to predict its performance on the true population. Since a larger sample is more representative of the population, it is advantageous to design the classifier with all the available cases and to use a resampling technique for performance prediction. We conducted a Monte Carlo simulation study to compare how well different resampling techniques predict the performance of a neural network (NN) classifier designed with the available sample. We used the area under the receiver operating characteristic curve as the performance index for the NN classifier. We investigated resampling techniques based on cross-validation, the leave-one-out method, and three types of bootstrapping: the ordinary, .632, and .632+ bootstrap. Our results indicated that, under the study conditions, the accuracy of the prediction can differ substantially between resampling methods, especially when the feature space dimensionality is relatively large and the sample size is small. Although this investigation was performed under specific conditions, it reveals important trends for the problem of classifier performance prediction under the constraint of a limited data set.
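
To make the resampling idea concrete, the following is a minimal sketch of a .632 bootstrap estimate of the area under the ROC curve. It is not the study's implementation: the classifier is a simple difference-of-means linear scorer standing in for the neural network, the function names (`auc`, `fit_mean_difference`, `bootstrap_632_auc`) and the synthetic data are hypothetical, and the sketch assumes the usual .632 weighting (0.368 for the resubstitution estimate, 0.632 for the out-of-bag estimate) is applied directly to the AUC.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 1/2)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def fit_mean_difference(X, y):
    """Stand-in classifier (NOT a neural network): project onto the class-mean difference."""
    w = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    return lambda Z: Z @ w

def bootstrap_632_auc(X, y, n_boot=200, seed=0):
    """Return resubstitution, out-of-bag, and .632 bootstrap AUC estimates."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Resubstitution: design and test on the same full sample (optimistic).
    auc_resub = auc(fit_mean_difference(X, y)(X), y)
    oob_aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # draw n cases with replacement
        out = np.setdiff1d(np.arange(n), idx)     # cases left out of this replicate
        # Skip replicates where either part lacks one of the two classes.
        if len(np.unique(y[idx])) < 2 or len(np.unique(y[out])) < 2:
            continue
        score = fit_mean_difference(X[idx], y[idx])
        oob_aucs.append(auc(score(X[out]), y[out]))
    auc_oob = float(np.mean(oob_aucs))            # pessimistic out-of-bag estimate
    return {
        "resub": auc_resub,
        "oob": auc_oob,
        "b632": 0.368 * auc_resub + 0.632 * auc_oob,
    }

# Usage on synthetic two-class Gaussian data (illustrative values only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (20, 5)), rng.normal(0.8, 1.0, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
print(bootstrap_632_auc(X, y))
```

The classifier is designed on all available cases, and the resampling loop is used only to predict its performance, which mirrors the strategy described in the abstract. The ordinary and .632+ bootstrap, cross-validation, and leave-one-out estimators differ in how the held-out and resubstitution results are combined and are not shown here.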
