
Discrepancy-based algorithms for best-subset model selection



Abstract

The selection of a best-subset regression model from a candidate family is a common problem that arises in many analyses. In best-subset model selection, we consider all possible subsets of regressor variables; thus, numerous candidate models may need to be fit and compared. One of the main challenges of best-subset selection arises from the size of the candidate model family: specifically, the probability of selecting an inappropriate model generally increases as the size of the family increases. For this reason, it is usually difficult to select an optimal model when best-subset selection is attempted based on a moderate to large number of regressor variables.

Model selection criteria are often constructed to estimate discrepancy measures used to assess the disparity between each fitted candidate model and the generating model. The Akaike information criterion (AIC) and the corrected AIC (AICc) are designed to estimate the expected Kullback-Leibler (K-L) discrepancy. For best-subset selection, both AIC and AICc are negatively biased, and the use of either criterion will lead to overfitted models. To correct for this bias, we introduce a criterion AICi, which has a penalty term evaluated from Monte Carlo simulation. A multistage model selection procedure AICaps, which utilizes AICi, is proposed for best-subset selection.

In the framework of linear regression models, the Gauss discrepancy is another frequently applied measure of proximity between a fitted candidate model and the generating model. Mallows' conceptual predictive statistic (Cp) and the modified Cp (MCp) are designed to estimate the expected Gauss discrepancy. For best-subset selection, Cp and MCp exhibit negative estimation bias. To correct for this bias, we propose a criterion CPSi that again employs a penalty term evaluated from Monte Carlo simulation. We further devise a multistage procedure, CPSaps, which selectively utilizes CPSi.

In this thesis, we consider best-subset selection in two different modeling frameworks: linear models and generalized linear models. Extensive simulation studies are compiled to compare the selection behavior of our methods and other traditional model selection criteria. We also apply our methods to a model selection problem in a study of bipolar disorder.
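For orientation, the standard textbook forms of the criteria named in the abstract are sketched below; the exact conventions adopted in the thesis may differ in constants or parameter counts.

```latex
% Notation: n = sample size, k or p = number of estimated parameters,
% \hat{L} = maximized likelihood, \mathrm{SSE}_p = residual sum of squares
% of the candidate subset model, \hat\sigma^2 = error-variance estimate
% from the full model. Textbook forms; thesis conventions may differ.
\mathrm{AIC}  = -2\log\hat{L} + 2k,
\qquad
\mathrm{AICc} = -2\log\hat{L} + \frac{2kn}{n-k-1},
\qquad
C_p = \frac{\mathrm{SSE}_p}{\hat\sigma^2} - n + 2p.
```

AIC and AICc serve as estimators of the expected K-L discrepancy, and Cp of the expected scaled Gauss discrepancy \(E\lVert \mu - \hat{y}_p \rVert^2 / \sigma^2\); the abstract's point is that, when a search is conducted over the full best-subset family, these estimators become negatively biased and favor overfitted models.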
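The abstract does not give the exact construction of the AICi or CPSi penalty terms. The sketch below only illustrates, for an assumed Gaussian linear model with known error variance, how an AIC-type penalty (the "optimism" of the in-sample fit) can be evaluated by Monte Carlo simulation; the function and variable names are hypothetical and this is not the thesis's best-subset penalty.

```python
import numpy as np

def mc_optimism(X, mu=None, sigma2=1.0, n_sim=2000, seed=0):
    """Monte Carlo estimate of the 'optimism' penalty of an AIC-type
    criterion for a Gaussian linear model with design matrix X and known
    error variance sigma2 (a hypothetical illustration, not AICi itself).

    Each replicate simulates y ~ N(mu, sigma2 * I), fits the model by
    least squares, and records the gap between the expected out-of-sample
    and the observed in-sample -2 log-likelihood (dropping constants
    shared by both terms).
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    if mu is None:
        mu = np.zeros(n)            # generating mean; lies in span(X) here
    H = X @ np.linalg.pinv(X)       # hat matrix of the candidate model
    gaps = np.empty(n_sim)
    for i in range(n_sim):
        y = mu + rng.normal(scale=np.sqrt(sigma2), size=n)
        yhat = H @ y
        in_fit = np.sum((y - yhat) ** 2) / sigma2
        # Expected out-of-sample -2 log-likelihood, computed analytically
        # (E||y* - yhat||^2 = n*sigma2 + ||mu - yhat||^2) to cut variance:
        out_fit = n + np.sum((mu - yhat) ** 2) / sigma2
        gaps[i] = out_fit - in_fit
    return gaps.mean()

# For a correctly specified model with k columns, the estimate is close
# to 2k, which is exactly AIC's penalty term:
X = np.column_stack([np.ones(50), np.arange(50.0), np.arange(50.0) ** 2])
print(mc_optimism(X))               # approximately 6.0 for k = 3
```

Under correct specification this estimate recovers AIC's classical penalty of 2k; the thesis's contribution, per the abstract, is to evaluate such penalties so that they remain unbiased in the best-subset setting, where the classical penalty undercorrects.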

Record details

  • Author: Tao Zhang
  • Format: PDF
  • Language: English
