Annual Conference on Neural Information Processing Systems

Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy

Abstract

We consider empirical risk minimization for large-scale datasets. We introduce Ada Newton as an adaptive algorithm that uses Newton's method with adaptive sample sizes. The main idea of Ada Newton is to increase the size of the training set by a factor larger than one, such that the minimization variable for the current training set lies in the local neighborhood of the optimal argument of the next training set. This allows us to exploit the quadratic convergence property of Newton's method and reach the statistical accuracy of each training set with only one Newton iteration. We show theoretically that we can iteratively increase the sample size while applying single Newton iterations without line search and staying within the statistical accuracy of the regularized empirical risk. In particular, we can double the size of the training set in each iteration when the number of samples is sufficiently large. Numerical experiments on various datasets confirm that the sample size can be increased by a factor of 2 at each iteration, which implies that Ada Newton achieves the statistical accuracy of the full training set with about two passes over the dataset.
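
To make the adaptive sample-size scheme concrete, the following is a minimal sketch of one possible implementation, assuming a regularized logistic-regression objective; the function name ada_newton, its parameters, and the loss choice are illustrative and not taken from the paper.

```python
import numpy as np

def ada_newton(X, y, w0, n0, lam=1e-4, growth=2.0):
    """Sketch of Newton's method with adaptive sample sizes.

    Assumes w0 already solves the problem on the first n0 samples to
    within its statistical accuracy. Each round enlarges the training
    set by `growth` and takes a single Newton step with no line search.
    (Illustrative sketch for a regularized logistic loss, not the
    authors' reference implementation.)
    """
    N = X.shape[0]
    n, w = n0, w0.copy()
    while n < N:
        n = min(int(growth * n), N)           # grow the training set
        Xn, yn = X[:n], y[:n]                 # labels yn in {-1, +1}
        z = yn * (Xn @ w)                     # margins y_i x_i^T w
        p = 1.0 / (1.0 + np.exp(-z))          # sigmoid of the margins
        # Gradient and Hessian of the regularized empirical risk on n samples
        g = Xn.T @ ((p - 1.0) * yn) / n + lam * w
        H = (Xn.T * (p * (1.0 - p))) @ Xn / n + lam * np.eye(len(w))
        w = w - np.linalg.solve(H, g)         # one Newton step, no line search
    return w
```

With growth=2.0 the sketch matches the regime described above: the training set doubles at each stage and a single Newton step per stage suffices, so the total work amounts to roughly two passes over the full dataset.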
