
Multiple Optimality Guarantees in Statistical Learning.



Abstract

Classically, the performance of estimators in statistical learning problems is measured in terms of their predictive ability or estimation error as the sample size n grows. In modern statistical and machine learning applications, however, computer scientists, statisticians, and analysts have a variety of additional criteria they must balance: estimators must be efficiently computable, data providers may wish to maintain anonymity, and large datasets must be stored and accessed. In this thesis, we consider the fundamental questions that arise when trading off among multiple such criteria (computation, communication, privacy) while maintaining statistical performance. Can we develop lower bounds that show there must be tradeoffs? Can we develop new procedures that are both theoretically optimal and practically useful?

To answer these questions, we explore examples from optimization, confidentiality-preserving statistical inference, and distributed estimation under communication constraints. Viewing our examples through a general lens of constrained minimax theory, we prove fundamental lower bounds on the statistical performance of any algorithm subject to the specified constraints, whether computational, confidentiality, or communication. These lower bounds allow us to guarantee the optimality of the new algorithms we develop for the additional criteria we consider, and we also show some of the practical benefits that a focus on multiple optimality criteria brings.

In somewhat more detail, the central contributions of this thesis include the following. We:

  • develop several new stochastic optimization algorithms, applicable to general classes of stochastic convex optimization problems, including methods that automatically adapt to the structure of the underlying problem, parallelize naturally to attain linear speedup in the number of available processors, and may be used asynchronously;
  • prove lower bounds demonstrating the optimality of these methods;
  • provide a variety of information-theoretic tools (strong data processing inequalities) useful for proving lower bounds in privacy-preserving statistical inference, communication-constrained estimation, and optimization;
  • develop new algorithms for private learning and estimation, and guarantee their optimality; and
  • give simple distributed estimation algorithms and prove fundamental limits showing that they (nearly) optimally trade off between communication (measured in the number of bits distributed processors may send) and statistical risk.
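To make the constrained minimax viewpoint concrete, the display below is a minimal sketch, with notation chosen here for illustration rather than quoted from the thesis: the first line is a minimax risk restricted to estimators obeying a computational, privacy, or communication constraint, and the second is one representative strong data processing inequality of the kind the abstract mentions for locally private channels (the constants follow the related published local-privacy results and should be checked against the thesis).

```latex
% Sketch of a constrained minimax risk: \mathcal{C} is the class of
% estimators/channels satisfying the constraint (computation, privacy,
% or communication budget); all notation here is illustrative.
\[
  \mathfrak{M}_n\bigl(\theta(\mathcal{P}), \Phi\circ\rho, \mathcal{C}\bigr)
  \;=\;
  \inf_{\widehat{\theta}\in\mathcal{C}}\;
  \sup_{P\in\mathcal{P}}\;
  \mathbb{E}_{P}\!\left[\Phi\bigl(\rho\bigl(\widehat{\theta}(X_1,\dots,X_n),\,\theta(P)\bigr)\bigr)\right]
\]
% A representative strong data processing inequality: for an
% \varepsilon-locally differentially private channel Q, the marginals
% M_1, M_2 induced from data distributions P_1, P_2 satisfy
\[
  D_{\mathrm{kl}}(M_1 \,\|\, M_2) + D_{\mathrm{kl}}(M_2 \,\|\, M_1)
  \;\le\; 4\,(e^{\varepsilon}-1)^2\,\|P_1 - P_2\|_{\mathrm{TV}}^2 .
\]
```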
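As a toy illustration of the bit-constrained distributed estimation in the final bullet above, here is a small, self-contained sketch; it is not code from the thesis, and the dithered quantizer, parameter names, and bit budget are assumptions made only for illustration. Each of m machines sends its local sample mean using a fixed number of bits per coordinate, and a fusion center averages the messages.

```python
import numpy as np

def dithered_quantize(x, lo, hi, bits, rng):
    """Unbiased (dithered) uniform quantizer: maps each coordinate of x,
    clipped to [lo, hi], onto one of 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    scaled = (np.clip(x, lo, hi) - lo) / (hi - lo) * levels  # in [0, levels]
    low = np.floor(scaled)
    # Round up with probability equal to the fractional part, so the
    # quantized value is unbiased for the clipped input.
    q = low + (rng.random(x.shape) < (scaled - low))
    return lo + q / levels * (hi - lo)

def distributed_mean(local_samples, bits=4, lo=-1.0, hi=1.0, seed=0):
    """Each machine sends only `bits` bits per coordinate describing its
    local sample mean; the fusion center averages the quantized messages."""
    rng = np.random.default_rng(seed)
    messages = [dithered_quantize(s.mean(axis=0), lo, hi, bits, rng)
                for s in local_samples]
    return np.mean(messages, axis=0)

# Toy usage: m machines, n samples each, d dimensions, data clipped to [-1, 1].
rng = np.random.default_rng(1)
m, n, d = 20, 200, 5
theta = rng.uniform(-0.5, 0.5, size=d)
data = [np.clip(theta + 0.3 * rng.standard_normal((n, d)), -1.0, 1.0)
        for _ in range(m)]
print("estimate:", distributed_mean(data, bits=4))
print("truth:   ", theta)
```

Because the quantizer is unbiased, the center's average stays unbiased and the error separates into a sampling term and a quantization term that shrinks as the per-coordinate bit budget grows, which is the flavor of communication/risk tradeoff the abstract describes.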

Bibliographic record

  • Author: Duchi, John C.
  • Affiliation: University of California, Berkeley
  • Degree grantor: University of California, Berkeley
  • Subject: Computer science; Statistics
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 266 p.
  • Total pages: 266
  • Format: PDF
  • Language: English
  • CLC classification:
  • Keywords:

