
Robust Regression via Hard Thresholding



Abstract

We study the problem of Robust Least Squares Regression (RLSR), in which several response variables can be adversarially corrupted. More specifically, for a data matrix X ∈ R^(p×n) and an underlying model w^*, the response vector is generated as y = X^T w^* + b, where b ∈ R^n is a corruption vector supported on at most C·n coordinates. Existing exact recovery results for RLSR focus solely on L_1-penalty based convex formulations and impose relatively strict model assumptions, such as requiring the corruptions b to be selected independently of X. In this work, we study a simple hard-thresholding algorithm called TORRENT which, under mild conditions on X, can recover w^* exactly even if b corrupts the response variables in an adversarial manner, i.e. both the support and the entries of b are selected adversarially after observing X and w^*. Our results hold under deterministic assumptions which are satisfied if X is sampled from any sub-Gaussian distribution. Moreover, unlike existing results that apply only to a fixed w^* generated independently of X, our results are universal and hold for any w^* ∈ R^p. Next, we propose gradient descent-based extensions of TORRENT that scale efficiently to large problems, such as high-dimensional sparse recovery, and prove similar recovery guarantees for these extensions. Empirically, we find that TORRENT, and even more so its extensions, offer significantly faster recovery than state-of-the-art L_1 solvers. For instance, even on moderate-sized datasets (with p = 50K) with around 40% corrupted responses, a variant of our proposed method called TORRENT-HYB is more than 20× faster than the best L_1 solver.
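The alternating structure described above — fit a least-squares model on the points currently believed clean, then hard-threshold the residuals to re-select that set — can be sketched as follows. This is an illustrative rendering of a TORRENT-style fully-corrective iteration under the noiseless model y = X^T w^* + b, not the authors' reference implementation; the function name, the corruption-fraction parameter `beta`, and the synthetic data are hypothetical.

```python
import numpy as np

def torrent_fc(X, y, beta, n_iters=50):
    """TORRENT-style fully-corrective sketch.

    X : (n, p) design matrix (rows are data points)
    y : (n,) responses, of which at most a beta fraction are corrupted
    beta : assumed upper bound on the fraction of corrupted responses
    """
    n, p = X.shape
    k = n - int(np.ceil(beta * n))   # size of the retained "clean" active set
    S = np.arange(n)                 # start with all points active
    w = np.zeros(p)
    for _ in range(n_iters):
        # Fully-corrective step: exact least squares on the active set only.
        w, *_ = np.linalg.lstsq(X[S], y[S], rcond=None)
        # Hard thresholding: keep the k points with smallest absolute residual.
        r = np.abs(y - X @ w)
        S = np.argsort(r)[:k]
    return w

# Tiny synthetic example: 20% of responses get large corruptions.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
w_star = rng.normal(size=p)
y = X @ w_star
y[:40] += 10 * rng.normal(size=40)   # corrupt the first 40 responses
w_hat = torrent_fc(X, y, beta=0.25)
```

Because the uncorrupted responses fit w^* exactly in this noiseless setting, once the active set contains only clean points the least-squares step recovers w^* to machine precision, which mirrors the exact-recovery guarantee stated in the abstract.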
