Journal of Machine Learning Research

A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization


Abstract

We present a Distributionally Robust Optimization (DRO) approach to estimating a robustified regression plane in a linear regression setting where the observed samples may be contaminated with adversarially corrupted outliers. Our approach mitigates the impact of outliers by hedging against a family of probability distributions on the observed data, some of which assign very low probabilities to the outliers. The distributions under consideration are those close to the empirical distribution in the sense of the Wasserstein metric. We show that this DRO formulation can be relaxed to a convex optimization problem that encompasses a class of models. By selecting proper norm spaces for the Wasserstein metric, we recover several commonly used regularized regression models. We provide new insights into the regularization term and give guidance on selecting the regularization coefficient from the standpoint of a confidence region. Under mild conditions, we establish two types of performance guarantees for the solution to our formulation: one concerns its out-of-sample behavior (prediction bias), and the other concerns the discrepancy between the estimated and true regression planes (estimation bias). Extensive numerical results demonstrate the superiority of our approach over a host of regression models in terms of prediction and estimation accuracy. We also apply our robust learning procedure to outlier detection and show that it achieves a much higher AUC (Area Under the ROC Curve) than M-estimation (Huber, 1964, 1973).
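To make the idea of a convex relaxation with a norm-based regularizer concrete, here is a minimal sketch. The abstract does not state the objective, so the form used here is an assumption: the mean absolute residual plus a penalty `eps * ||(-beta, 1)||`, with the Euclidean norm chosen for illustration and no intercept term. The plain subgradient solver, the toy data, and all function names (`dro_objective`, `fit_subgradient`, `ols_through_origin`) are hypothetical and are not the authors' implementation.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def dro_objective(beta, X, y, eps):
    """Assumed relaxed objective: mean |y - x'beta| + eps * ||(-beta, 1)||_2."""
    n = len(y)
    loss = sum(abs(yi - dot(beta, xi)) for xi, yi in zip(X, y)) / n
    penalty = eps * math.sqrt(dot(beta, beta) + 1.0)
    return loss + penalty

def fit_subgradient(X, y, eps, steps=2000, lr=0.5):
    """Subgradient descent with step lr / sqrt(t); returns the best iterate seen."""
    n, d = len(y), len(X[0])
    beta = [0.0] * d
    best, best_val = list(beta), dro_objective(beta, X, y, eps)
    for t in range(1, steps + 1):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            r = yi - dot(beta, xi)
            s = (r > 0) - (r < 0)  # sign of the residual, 0 at a kink
            for j in range(d):
                grad[j] -= s * xi[j] / n
        norm = math.sqrt(dot(beta, beta) + 1.0)
        for j in range(d):
            grad[j] += eps * beta[j] / norm  # gradient of the norm penalty
        step = lr / math.sqrt(t)
        beta = [b - step * g for b, g in zip(beta, grad)]
        val = dro_objective(beta, X, y, eps)
        if val < best_val:
            best, best_val = list(beta), val
    return best

def ols_through_origin(X, y):
    """Least-squares slope for a single feature, used only as a comparison."""
    return sum(xi[0] * yi for xi, yi in zip(X, y)) / sum(xi[0] ** 2 for xi in X)

# Toy data (made up): y = 2x on a grid, plus two gross outliers.
X = [[-2.0 + 0.2 * i] for i in range(21)]
y = [2.0 * xi[0] for xi in X]
X += [[1.8], [1.9]]
y += [-15.0, -15.0]

beta = fit_subgradient(X, y, eps=0.1)
print("robust slope:", beta[0])
print("least-squares slope:", ols_through_origin(X, y))
```

On this toy data the two outliers pull the least-squares slope far from the true value of 2, while the absolute-loss objective keeps the estimate close to it. Swapping the Euclidean norm for another norm changes the penalty, which is one way to read the abstract's remark that different norm spaces recover different regularized regression models.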
