PNAS Plus: Optimal errors and phase transitions in high-dimensional generalized linear models

Proceedings of the National Academy of Sciences of the United States of America

Abstract

Generalized linear models (GLMs) are used in high-dimensional machine learning, statistics, communications, and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes, or benchmark models in neural networks. We evaluate the mutual information (or “free entropy”) from which we deduce the Bayes-optimal estimation and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and the dimension are large and their ratio is fixed. Nonrigorous predictions for the optimal errors existed for special cases of GLMs, e.g., for the perceptron, in the field of statistical physics based on the so-called replica method. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance and locate the associated sharp phase transitions separating learnable and nonlearnable regions. We believe that this random version of GLMs can serve as a challenging benchmark for multipurpose algorithms.
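
To make the setting concrete, here is a minimal sketch of how one instance of a random-design GLM could be generated in teacher-student form. It assumes a Gaussian data matrix, a Gaussian prior on the hidden weights, and a sign (perceptron) output channel. The function name `sample_glm`, the parameter names, and the `1/sqrt(n)` scaling are choices made for this illustration and are not taken from the paper.

```python
import numpy as np

def sample_glm(n, alpha, rng, channel=np.sign):
    """Draw one random-design GLM instance (illustrative sketch only).

    n       : dimension of the hidden weight vector
    alpha   : sample ratio m / n, held fixed as n grows
    channel : output function applied to the pre-activations
    """
    m = int(round(alpha * n))         # number of samples
    w_star = rng.standard_normal(n)   # hidden "teacher" weights (assumed Gaussian prior)
    X = rng.standard_normal((m, n))   # random data matrix with i.i.d. entries
    z = X @ w_star / np.sqrt(n)       # pre-activations, order 1 per entry
    y = channel(z)                    # observed outputs (labels for the sign channel)
    return X, y, w_star

rng = np.random.default_rng(0)
X, y, w_star = sample_glm(n=200, alpha=1.5, rng=rng)
print(X.shape, y.shape)  # (300, 200) (300,)
```

Swapping `channel` for another function (for example the identity plus noise) gives other members of the GLM family. The high-dimensional limit in the abstract corresponds to increasing `n` while keeping `alpha` fixed.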
