Comparison of Bayesian predictive methods for model selection

Piironen Juho; Vehtari Aki

Abstract

The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences, and give recommendations about the preferred approaches. We focus on variable subset selection for regression and classification and perform several numerical experiments using both simulated and real-world data. The results show that optimizing a utility estimate such as the cross-validation (CV) score is liable to find overfitted models, because the utility estimates have relatively high variance when data are scarce. This can also lead to substantial selection-induced bias and optimism in the performance evaluation of the selected model. From a predictive viewpoint, the best results are obtained by accounting for model uncertainty by forming the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information of the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on the CV score. Overall, the projection method also appears to outperform the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that model selection can benefit greatly from using cross-validation outside the search process, both for guiding the choice of model size and for assessing the predictive performance of the final selected model.
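The selection-induced bias described in the abstract can be illustrated with a small toy simulation. The sketch below is not taken from the paper: it assumes ordinary least squares with a single candidate variable per model (rather than the Bayesian models the authors study), pure-noise predictors, and hypothetical function names (`loo_cv_mse`, `selection_bias_demo`). It picks the candidate with the best leave-one-out CV score and compares that score with the error of the same model on fresh data.

```python
import numpy as np


def loo_cv_mse(X, y):
    """Leave-one-out CV mean squared error of y ~ intercept + X[:, j], for every column j.

    Uses the closed-form least-squares identity:
    LOO residual = in-sample residual / (1 - leverage).
    Returns (cv_mse, beta), both of shape (p,).
    """
    n = X.shape[0]
    xc = X - X.mean(axis=0)                        # centred predictors, shape (n, p)
    yc = y - y.mean()                              # centred response, shape (n,)
    sxx = (xc ** 2).sum(axis=0)                    # per-column sum of squares
    beta = (xc * yc[:, None]).sum(axis=0) / sxx    # per-column OLS slope
    resid = yc[:, None] - xc * beta                # residuals of each one-variable fit
    leverage = 1.0 / n + xc ** 2 / sxx             # hat-matrix diagonal for [1, x_j]
    cv_mse = ((resid / (1.0 - leverage)) ** 2).mean(axis=0)
    return cv_mse, beta


def selection_bias_demo(n=30, p=50, n_test=2000, reps=200, seed=0):
    """Average CV score vs. average test error of the CV-selected model.

    Assumption for illustration: y is independent of every candidate predictor,
    so no model can truly beat an error of 1 (the noise variance).
    """
    rng = np.random.default_rng(seed)
    cv_of_selected, test_of_selected = [], []
    for _ in range(reps):
        X = rng.standard_normal((n, p))
        y = rng.standard_normal(n)                 # pure noise, unrelated to X
        cv_mse, beta = loo_cv_mse(X, y)
        j = int(np.argmin(cv_mse))                 # select the model with the best CV score
        # Evaluate the selected model on fresh data from the same distribution.
        x_test = rng.standard_normal(n_test)
        y_test = rng.standard_normal(n_test)
        pred = y.mean() + beta[j] * (x_test - X[:, j].mean())
        cv_of_selected.append(cv_mse[j])
        test_of_selected.append(np.mean((y_test - pred) ** 2))
    return float(np.mean(cv_of_selected)), float(np.mean(test_of_selected))


if __name__ == "__main__":
    cv, test = selection_bias_demo()
    print(f"mean CV MSE of selected model:   {cv:.3f}")
    print(f"mean test MSE of selected model: {test:.3f}")
```

Because the minimum of many noisy CV estimates is itself biased downward, the CV score of the selected model tends to look better than its error on fresh data. This is the reason the abstract gives for running cross-validation outside the search process when assessing the final model.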

Bibliographic details

Source
Statistics and Computing | 2017, issue 3 | pp. 711-735 | 25 pages
Authors
Piironen Juho; Vehtari Aki
Affiliations

Aalto Univ, Helsinki Inst Informat Technol HIIT, Dept Comp Sci, Espoo, Finland (both authors)

Original format: PDF
Language: English
Keywords
Bayesian model selection; Cross-validation; Reference model; Projection; Selection bias
