Regularized regression methods for variable selection and estimation.

Abstract

We make two contributions to the body of work on the variable selection and estimation problem. First, we propose a new penalized likelihood procedure, the seamless-L0 (SELO) method, which utilizes a continuous penalty function that closely approximates the discontinuous L0 penalty. The SELO penalized likelihood procedure consistently selects the correct variables and is asymptotically normal, provided the number of variables grows more slowly than the number of observations. The SELO method is efficiently implemented using a coordinate descent algorithm. Tuning parameter selection is crucial to the performance of the SELO procedure. We propose a BIC-like tuning parameter selection method for SELO which consistently identifies the correct model, even if the number of variables diverges. Simulation results show that the SELO procedure with BIC tuning parameter selection performs very well in a variety of settings, outperforming other popular penalized likelihood procedures by a substantial margin. Using SELO, we analyze a publicly available HIV drug resistance and mutation dataset and obtain interpretable results.

Our second contribution is the development of techniques for estimating-equation-based variable selection. We use the Dantzig selector, a variable selection and estimation procedure based on the normal score equations, as a template. After deriving new asymptotic results for the Dantzig selector, we propose the adaptive Dantzig selector, an extension of the Dantzig selector which consistently selects the correct variables and is asymptotically normal. We show that the adaptive Dantzig selector outperforms the Dantzig selector in various simulated settings. Finally, we show that the Dantzig selector may be extended to handle many different types of data, provided a reasonable estimating equation is available; a full likelihood model for the data is not necessary. Our generalization of the Dantzig selector for estimating equations has good asymptotic properties, which are similar in flavor to those of the adaptive Dantzig selector. As an example, we consider the application of the Dantzig selector to generalized estimating equations (GEEs). We show that the performance of variable selection and estimation procedures may be improved by using GEEs to account for excess correlation which may be present in the data.
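The abstract describes the SELO penalty only as a continuous function that closely approximates the discontinuous L0 penalty; it does not give a formula. As a minimal sketch, the code below assumes the form p(β) = (λ / log 2) · log(|β| / (|β| + τ) + 1), which is the expression usually associated with SELO. The function names `selo_penalty` and `l0_penalty`, and the default values of `lam` and `tau`, are illustrative choices and do not come from the abstract.

```python
import math


def selo_penalty(beta: float, lam: float = 1.0, tau: float = 0.01) -> float:
    """Assumed SELO form: (lam / log 2) * log(|beta| / (|beta| + tau) + 1).

    The formula is an assumption; the abstract only says the penalty is
    continuous and close to L0. tau > 0 controls how sharp the approximation is.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    a = abs(beta)
    return (lam / math.log(2.0)) * math.log(a / (a + tau) + 1.0)


def l0_penalty(beta: float, lam: float = 1.0) -> float:
    """Discontinuous L0 penalty: lam if the coefficient is nonzero, else 0."""
    return lam if beta != 0 else 0.0


if __name__ == "__main__":
    # Compare the two penalties over a range of coefficient sizes.
    for beta in (0.0, 0.001, 0.01, 0.1, 1.0, 10.0):
        print(f"beta={beta:<6} SELO={selo_penalty(beta):.4f} L0={l0_penalty(beta):.1f}")
```

Under this assumed form the penalty is exactly zero at β = 0, increases continuously with |β|, and levels off near λ, so shrinking `tau` moves it closer to the L0 step function while keeping it continuous.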

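Bibliographic record

  • Author

    Dicker, Lee Herbrandson

  • Author affiliation

    Harvard University

  • Degree-granting institution: Harvard University
  • Subject: Biology, Biostatistics
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 111 p.
  • Total pages: 111
  • Original format: PDF
  • Text language: English (eng)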