AAAI Conference on Artificial Intelligence

A Sharper Generalization Bound for Divide-and-Conquer Ridge Regression



Abstract

We study the distributed machine learning problem in which n feature-response pairs are partitioned among m machines uniformly at random. The goal is to approximately solve an empirical risk minimization (ERM) problem with the minimum amount of communication. The divide-and-conquer (DC) method, proposed several years ago, has every worker machine independently solve the same ERM problem on its local feature-response pairs and lets the driver machine combine the solutions. This approach is one-shot and thereby extremely communication-efficient. Although the DC method has been studied by many prior works, a reasonable generalization bound had not been established before this work. For the ridge regression problem, we show that the prediction error of the DC method on unseen test samples is at most ε times larger than the optimal. While prior works have established constant-factor bounds, their sample complexities have a quadratic dependence on the dimension d, which does not match the setting of most real-world problems. In contrast, our bounds are much stronger. First, our 1 + ε error bound is much better than their constant-factor bounds. Second, our sample complexity is merely linear in d.
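To make the one-shot protocol concrete, below is a minimal NumPy sketch of the DC scheme as the abstract describes it: the samples are partitioned uniformly at random among m workers, each worker solves the same ridge ERM on its local shard, and the driver averages the local solutions. The closed-form local solver, the n·λ scaling of the regularizer, and the synthetic data are illustrative assumptions, not the paper's exact protocol.

```python
# A minimal sketch of divide-and-conquer (DC) ridge regression.
# The helper names and the regularization convention are assumptions
# made for illustration, not taken from the paper.
import numpy as np

def local_ridge(X, y, lam):
    """Closed-form ridge solution on one shard:
    w = (X^T X + n*lam*I)^{-1} X^T y, with n the local sample size."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)

def dc_ridge(X, y, m, lam, rng):
    """One-shot DC: partition the n samples uniformly at random among
    m workers, solve ridge locally, and average on the driver."""
    n = X.shape[0]
    shards = np.array_split(rng.permutation(n), m)
    local_solutions = [local_ridge(X[idx], y[idx], lam) for idx in shards]
    return np.mean(local_solutions, axis=0)

# Tiny usage example on synthetic data (illustrative only).
rng = np.random.default_rng(0)
n, d, m, lam = 10_000, 20, 8, 1e-2
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)

w_dc = dc_ridge(X, y, m, lam, rng)
w_full = local_ridge(X, y, lam)       # centralized baseline
print(np.linalg.norm(w_dc - w_full))  # averaged DC solution stays close
```

Each worker communicates only its d-dimensional local solution once, which is what makes the method extremely communication-efficient compared with iterative distributed solvers.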
