
Discrepancy-based algorithms for best-subset model selection



Abstract

The selection of a best-subset regression model from a candidate family is a common problem that arises in many analyses. In best-subset model selection, we consider all possible subsets of regressor variables; thus, numerous candidate models may need to be fit and compared. One of the main challenges of best-subset selection arises from the size of the candidate model family: specifically, the probability of selecting an inappropriate model generally increases as the size of the family increases. For this reason, it is usually difficult to select an optimal model when best-subset selection is attempted based on a moderate to large number of regressor variables.

Model selection criteria are often constructed to estimate discrepancy measures used to assess the disparity between each fitted candidate model and the generating model. The Akaike information criterion (AIC) and the corrected AIC (AICc) are designed to estimate the expected Kullback-Leibler (K-L) discrepancy. For best-subset selection, both AIC and AICc are negatively biased, and the use of either criterion will lead to overfitted models. To correct for this bias, we introduce a criterion AICi, which has a penalty term evaluated from Monte Carlo simulation. A multistage model selection procedure AICaps, which utilizes AICi, is proposed for best-subset selection.

In the framework of linear regression models, the Gauss discrepancy is another frequently applied measure of proximity between a fitted candidate model and the generating model. Mallows' conceptual predictive statistic (Cp) and the modified Cp (MCp) are designed to estimate the expected Gauss discrepancy. For best-subset selection, Cp and MCp exhibit negative estimation bias. To correct for this bias, we propose a criterion CPSi that again employs a penalty term evaluated from Monte Carlo simulation. We further devise a multistage procedure, CPSaps, which selectively utilizes CPSi.

In this thesis, we consider best-subset selection in two different modeling frameworks: linear models and generalized linear models. Extensive simulation studies are compiled to compare the selection behavior of our methods and other traditional model selection criteria. We also apply our methods to a model selection problem in a study of bipolar disorder.
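For orientation, the standard textbook forms of the criteria named in the abstract are sketched below; the exact conventions adopted in the thesis may differ in constants or parameter counts.

```latex
% Notation: n = sample size, k or p = number of estimated parameters,
% \hat{L} = maximized likelihood, \mathrm{SSE}_p = residual sum of squares
% of the candidate subset model, \hat\sigma^2 = error-variance estimate
% from the full model. Textbook forms; thesis conventions may differ.
\mathrm{AIC}  = -2\log\hat{L} + 2k,
\qquad
\mathrm{AICc} = -2\log\hat{L} + \frac{2kn}{n-k-1},
\qquad
C_p = \frac{\mathrm{SSE}_p}{\hat\sigma^2} - n + 2p.
```

AIC and AICc serve as estimators of the expected K-L discrepancy, and Cp of the expected scaled Gauss discrepancy \(E\lVert \mu - \hat{y}_p \rVert^2 / \sigma^2\); the abstract's point is that, when a search is conducted over the full best-subset family, these estimators become negatively biased and favor overfitted models.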
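The abstract does not give the exact construction of the AICi or CPSi penalty terms. The sketch below only illustrates, for an assumed Gaussian linear model with known error variance, how an AIC-type penalty (the "optimism" of the in-sample fit) can be evaluated by Monte Carlo simulation; the function and variable names are hypothetical and this is not the thesis's best-subset penalty.

```python
import numpy as np

def mc_optimism(X, mu=None, sigma2=1.0, n_sim=2000, seed=0):
    """Monte Carlo estimate of the 'optimism' penalty of an AIC-type
    criterion for a Gaussian linear model with design matrix X and known
    error variance sigma2 (a hypothetical illustration, not AICi itself).

    Each replicate simulates y ~ N(mu, sigma2 * I), fits the model by
    least squares, and records the gap between the expected out-of-sample
    and the observed in-sample -2 log-likelihood (dropping constants
    shared by both terms).
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    if mu is None:
        mu = np.zeros(n)            # generating mean; lies in span(X) here
    H = X @ np.linalg.pinv(X)       # hat matrix of the candidate model
    gaps = np.empty(n_sim)
    for i in range(n_sim):
        y = mu + rng.normal(scale=np.sqrt(sigma2), size=n)
        yhat = H @ y
        in_fit = np.sum((y - yhat) ** 2) / sigma2
        # Expected out-of-sample -2 log-likelihood, computed analytically
        # (E||y* - yhat||^2 = n*sigma2 + ||mu - yhat||^2) to cut variance:
        out_fit = n + np.sum((mu - yhat) ** 2) / sigma2
        gaps[i] = out_fit - in_fit
    return gaps.mean()

# For a correctly specified model with k columns, the estimate is close
# to 2k, which is exactly AIC's penalty term:
X = np.column_stack([np.ones(50), np.arange(50.0), np.arange(50.0) ** 2])
print(mc_optimism(X))               # approximately 6.0 for k = 3
```

Under correct specification this estimate recovers AIC's classical penalty of 2k; the thesis's contribution, per the abstract, is to evaluate such penalties so that they remain unbiased in the best-subset setting, where the classical penalty undercorrects.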

Record details

  • Author: Tao Zhang
  • Format: PDF
  • Language: English
