Influence Diagnostics for High-Dimensional Lasso Regression

Abstract

The increased availability of high-dimensional data and the appeal of a "sparse" solution have made penalized likelihood methods commonplace. Arguably the most widely utilized of these methods is ℓ1 regularization, popularly known as the lasso. When the lasso is applied to high-dimensional data, observations are relatively few; thus, each observation can potentially have tremendous influence on model selection and inference. Hence, a natural question in this context is the identification and assessment of influential observations. We address this by extending the framework for assessing estimation influence in traditional linear regression, and demonstrate that it is equally, if not more, relevant for assessing model selection influence for high-dimensional lasso regression. Within this framework, we propose four new "deletion methods" for gauging the influence of an observation on lasso model selection: df-model, df-regpath, df-cvpath, and df-lambda. Asymptotic cut-offs for each measure, even when p → ∞, are developed. We illustrate that in high-dimensional settings, individual observations can have a tremendous impact on lasso model selection. We demonstrate that application of our measures can help reveal relationships in high-dimensional real data that may otherwise remain hidden. Supplementary materials for this article are available online.
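
As a rough illustration of the deletion idea described in the abstract, the sketch below refits a lasso with each observation left out in turn and counts how many variables enter or leave the selected set. It is not the authors' definition of df-model, df-regpath, df-cvpath, or df-lambda, which the abstract does not spell out. The function names (`lasso_cd`, `selection_change`), the fixed penalty `lam`, the symmetric-difference count, and the toy data are all assumptions made for illustration. The plain coordinate-descent solver has no intercept or standardization and stands in for whatever lasso implementation one would actually use.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    X is a list of rows, y a list of responses. No intercept and no
    standardization: both are simplifying assumptions of this sketch.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column can never be selected
            # Correlation of column j with the partial residual (beta[j] added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


def support(beta):
    """Indices of the variables the lasso selected (nonzero coefficients)."""
    return {j for j, b in enumerate(beta) if b != 0.0}


def selection_change(X, y, lam):
    """Hypothetical leave-one-out measure: for each observation i, the number of
    variables whose selection status changes when i is deleted and the lasso
    is refit at the same penalty lam."""
    full = support(lasso_cd(X, y, lam))
    changes = []
    for i in range(len(X)):
        X_minus_i = X[:i] + X[i + 1:]
        y_minus_i = y[:i] + y[i + 1:]
        reduced = support(lasso_cd(X_minus_i, y_minus_i, lam))
        changes.append(len(full ^ reduced))
    return changes


# Toy data (made up): only the last row carries any signal for column 1,
# so deleting it removes that variable from the selected model.
X = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
y = [1.0, 1.0, 1.0, 5.0]
changes = selection_change(X, y, lam=0.1)  # -> [0, 0, 0, 1]
```

In this toy case the fourth observation alone determines whether the second variable is selected, which is the kind of single-observation effect on model selection that the abstract describes. A practical version would also need a cut-off for deciding when a change is large, which this sketch does not attempt.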