Journal: Statistics and Computing

Comparison of Bayesian predictive methods for model selection

Abstract

The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences, and give recommendations about the preferred approaches. We focus on variable subset selection for regression and classification and perform several numerical experiments using both simulated and real-world data. The results show that optimizing a utility estimate such as the cross-validation (CV) score is liable to find overfitted models, because the utility estimates have relatively high variance when data are scarce. This can also lead to substantial selection-induced bias and optimism in the performance evaluation of the selected model. From a predictive viewpoint, the best results are obtained by accounting for model uncertainty by forming the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information of the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on the CV score. Overall, the projection method also appears to outperform the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that model selection can greatly benefit from using cross-validation outside the search process, both for guiding the choice of model size and for assessing the predictive performance of the finally selected model.
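
The selection-induced optimism mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's experimental setup: it uses ordinary least squares rather than a Bayesian model, and the function names (`loo_mse`, `one_replication`, `simulate`) and the settings (20 observations, 50 pure-noise candidate variables) are assumptions chosen only for illustration. Because the response is independent of every candidate variable, no one-variable model can truly beat a mean squared error of about 1, yet the CV score of the model chosen by minimizing CV tends to look better than that.

```python
import numpy as np


def loo_mse(x, y):
    """Leave-one-out CV mean squared error for the least-squares fit y ~ a + b*x."""
    X = np.column_stack([np.ones_like(x), x])
    # Hat matrix H = X (X^T X)^{-1} X^T; the LOO residual is e_i / (1 - h_ii).
    H = X @ np.linalg.solve(X.T @ X, X.T)
    resid = y - H @ y
    return float(np.mean((resid / (1.0 - np.diag(H))) ** 2))


def one_replication(rng, n=20, p=50, n_test=5000):
    """Pick the one-variable model with the best LOO-CV score on pure-noise data."""
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)  # y is independent of every column of X
    cv = np.array([loo_mse(X[:, j], y) for j in range(p)])
    best = int(np.argmin(cv))

    # Refit the selected model on all training data.
    A = np.column_stack([np.ones(n), X[:, best]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    # Evaluate on fresh data that played no part in the selection.
    x_test = rng.standard_normal(n_test)
    y_test = rng.standard_normal(n_test)
    test_mse = float(np.mean((y_test - (coef[0] + coef[1] * x_test)) ** 2))
    return cv[best], test_mse


def simulate(n_rep=50, seed=0):
    """Average the selected model's CV score and its test error over replications."""
    rng = np.random.default_rng(seed)
    results = np.array([one_replication(rng) for _ in range(n_rep)])
    return float(results[:, 0].mean()), float(results[:, 1].mean())


cv_mean, test_mean = simulate()
print(f"mean CV MSE of selected model:   {cv_mean:.3f}")
print(f"mean test MSE of selected model: {test_mean:.3f}")
```

The gap between the two printed averages is the optimism of the CV score after it has been used for selection. Wrapping the whole search in an outer cross-validation loop, as the abstract recommends, would give an estimate that is not affected by this selection step.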
