Conference: 2012 IEEE International Symposium on Information Theory Proceedings

Corrupted and missing predictors: Minimax bounds for high-dimensional linear regression

Abstract

Missing and corrupted data are ubiquitous in many science and engineering domains. We analyze the information-theoretic limits of recovering sparse vectors under various models of corrupted and missing data. In particular, consider a high-dimensional linear regression model $y = X\beta + \epsilon$, where $y \in \mathbb{R}^n$ is the response vector, $X \in \mathbb{R}^{n \times p}$ is a random design matrix with $p \gg n$ and rows distributed i.i.d. as $N(0, \Sigma_x)$, $\beta \in \mathbb{R}^p$ is the unknown regression vector, and $\epsilon \sim N(0, \sigma_\epsilon^2 I)$ is independent additive noise. Whereas a traditional approach assumes that the covariates $X$ are fully observed, we assume only that a corrupted version $Z$ is observed. Our main contribution is to establish minimax rates of convergence for estimating $\beta$ in squared $\ell_2$-loss, assuming $\beta$ is $k$-sparse. Our upper and lower bounds in both additive noise and missing data cases scale as $k \log(p/k)/n$, with prefactors depending only on the corruption and/or missing pattern of the data.
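To make the setup in the abstract concrete, the following is a minimal simulation sketch, not the authors' code. It assumes $\Sigma_x = I$, an additive-noise corruption of the form $Z = X + W$ with i.i.d. Gaussian $W$, and a missing-data corruption in which each entry is dropped independently with probability `rho`. The abstract does not spell out these corruption forms, and the names `simulate`, `rate`, `sigma_w`, and `rho` are hypothetical.

```python
import math
import random


def simulate(n, p, k, sigma_eps=1.0, sigma_w=0.5, rho=0.2, seed=0):
    """Draw data from y = X beta + eps with Sigma_x = I (an assumption)."""
    rng = random.Random(seed)
    # k-sparse regression vector; placing the support on the first k
    # coordinates is an illustrative choice only.
    beta = [1.0] * k + [0.0] * (p - k)
    # Rows of X are i.i.d. N(0, I).
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # Response with independent N(0, sigma_eps^2) noise.
    y = [
        sum(x_j * b_j for x_j, b_j in zip(row, beta)) + rng.gauss(0.0, sigma_eps)
        for row in X
    ]
    # Assumed additive-noise corruption: Z = X + W, W_ij ~ N(0, sigma_w^2).
    Z_add = [[x + rng.gauss(0.0, sigma_w) for x in row] for row in X]
    # Assumed missing-data corruption: each entry is missing (None)
    # independently with probability rho.
    Z_miss = [[None if rng.random() < rho else x for x in row] for row in X]
    return y, X, Z_add, Z_miss, beta


def rate(n, p, k):
    """The k log(p/k) / n scaling from the abstract, without any prefactor."""
    return k * math.log(p / k) / n
```

An estimator would see only `y` together with `Z_add` or `Z_miss`, never `X` itself. `rate(n, p, k)` gives only the scaling of the bounds; the corruption-dependent prefactors mentioned in the abstract are not modeled here.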
