Selection of Active Predictors for Misspecified Binary Model

Abstract

Selection of active predictors in high-dimensional regression problems plays a pivotal role in contemporary data mining and statistical inference. However, properties of frequently applied selection procedures, such as consistent choice of the active set, usually rely strongly on the assumption that the data follow a specific model. In this presentation we address this problem and discuss general setups in which estimation procedures can approximately recover the direction of the true parameter vector and estimate its support consistently. This explains the occasionally observed phenomenon that certain procedures work well even when the underlying data-generating mechanism is misspecified, e.g. when methods constructed for linear models are applied to binary regression. The basic reasoning was discovered long ago by D. Brillinger and P. Ruud, but it is scarcely known in the data mining community. As a particular application we introduce a two-stage selection procedure which first screens predictors using the LASSO method for logistic regression and then chooses the final model by optimizing the Generalized Information Criterion over the ensuing hierarchical family. We discuss its properties, in particular the fact that under misspecification it picks, with high probability, a model that approximates the Kullback-Leibler projection (in the average sense) onto the family of logistic regressions.
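
The direction-recovery phenomenon mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it fits ordinary least squares (a method built for linear models) to binary responses and checks how well the fitted coefficients align with the true parameter vector. The parameter vector `beta`, the sample size, the independent standard Gaussian predictors, and the logistic response mechanism are all assumptions chosen for illustration; Gaussian predictors are the classical setting in which this kind of result is usually stated.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def direction_recovery_demo(n=20000, seed=0):
    """Fit a (misspecified) linear model to binary data; compare directions."""
    rng = random.Random(seed)
    beta = [1.0, -2.0, 0.0, 0.5]  # hypothetical true parameters (assumption)
    p = len(beta)

    # Simulate Gaussian predictors and binary responses from a logistic model.
    xs, ys = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        eta = sum(b * v for b, v in zip(beta, x))
        prob = 1.0 / (1.0 + math.exp(-eta))
        xs.append(x)
        ys.append(1.0 if rng.random() < prob else 0.0)

    # Least squares with intercept: center the data, solve the normal equations.
    x_mean = [sum(x[j] for x in xs) / n for j in range(p)]
    y_mean = sum(ys) / n
    gram = [[sum((x[j] - x_mean[j]) * (x[k] - x_mean[k]) for x in xs)
             for k in range(p)] for j in range(p)]
    xty = [sum((x[j] - x_mean[j]) * (y - y_mean) for x, y in zip(xs, ys))
           for j in range(p)]
    beta_hat = solve_linear_system(gram, xty)

    # Cosine of the angle between the true and the estimated coefficient vectors.
    dot = sum(b * h for b, h in zip(beta, beta_hat))
    norm = math.sqrt(sum(b * b for b in beta)) * math.sqrt(sum(h * h for h in beta_hat))
    return beta, beta_hat, dot / norm


if __name__ == "__main__":
    beta, beta_hat, cosine = direction_recovery_demo()
    print("true beta:     ", beta)
    print("least squares: ", [round(h, 4) for h in beta_hat])
    print("cosine of angle:", round(cosine, 4))
```

In this setup the least-squares coefficients are shrunk relative to `beta`, but they point in nearly the same direction, and the coefficient of the inactive predictor stays close to zero. This is the property that lets support selection succeed despite misspecification.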
