Electronic Journal of Statistics

Sketching meets random projection in the dual: A provable recovery algorithm for big and high-dimensional data

Abstract

Sketching techniques scale up machine learning algorithms by reducing the sample size or dimensionality of massive data sets without sacrificing their statistical properties. In this paper, we study sketching from an optimization point of view. We first show that the iterative Hessian sketch is an optimization process with preconditioning and develop an accelerated version using this insight together with conjugate gradient descent. Next, we establish a primal-dual connection between the Hessian sketch and dual random projection, which allows us to develop an accelerated iterative dual random projection method by applying preconditioned conjugate gradient descent to the dual problem. Finally, we tackle the problems of large sample size and high dimensionality in massive data sets by developing the primal-dual sketch. The primal-dual sketch iteratively sketches the primal and dual formulations and requires only a logarithmic number of calls to solvers of small sub-problems to recover the optimum of the original problem up to arbitrary precision. Our iterative sketching techniques can also be applied to distributed optimization problems in which data are partitioned by samples or features. Experiments on synthetic and real data sets complement our theoretical results.
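
As a rough illustration of the abstract's first claim, the basic iterative Hessian sketch can be read as gradient descent preconditioned by a sketched Hessian. The sketch below assumes ordinary least squares, a dense Gaussian sketch matrix redrawn at every iteration, and a sketch size of 20 times the dimension; the function names `iterative_hessian_sketch` and `demo` are hypothetical. It is not the accelerated (conjugate gradient) variant or the primal-dual sketch developed in the paper.

```python
import numpy as np

def iterative_hessian_sketch(X, y, sketch_size, n_iters=20, seed=0):
    """Approximate argmin_w 0.5 * ||X w - y||^2 with sketched Newton steps.

    Assumption: a dense Gaussian sketch S, redrawn at every iteration.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        # Entries ~ N(0, 1/sketch_size), so that E[S.T @ S] = I.
        S = rng.standard_normal((sketch_size, n)) / np.sqrt(sketch_size)
        SX = S @ X                      # sketched data, sketch_size x d
        H_sketch = SX.T @ SX            # sketched Hessian (the preconditioner)
        grad = X.T @ (X @ w - y)        # exact gradient on the full data
        # Preconditioned gradient step: solve a small d x d system.
        w = w - np.linalg.solve(H_sketch, grad)
    return w

def demo(n=2000, d=20, seed=1):
    """Return the distance between the sketched and exact least-squares solutions."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    w_true = rng.standard_normal(d)
    y = X @ w_true + 0.1 * rng.standard_normal(n)
    w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
    w_ihs = iterative_hessian_sketch(X, y, sketch_size=20 * d)
    return np.linalg.norm(w_ihs - w_ls)
```

Each iteration uses the full data only to compute the gradient; the linear system that is solved has the size of the sketched problem, which is the sense in which only small sub-problems are handed to a solver.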
