JMLR: Workshop and Conference Proceedings

Local Optimality and Generalization Guarantees for the Langevin Algorithm via Empirical Metastability



Abstract

We study the detailed path-wise behavior of the discrete-time Langevin algorithm for non-convex Empirical Risk Minimization (ERM) through the lens of metastability, adopting some techniques from Berglund and Gentz (2003). For a particular local optimum of the empirical risk, with an \textit{arbitrary initialization}, we show that, with high probability, at least one of the following two events will occur: (1) the Langevin trajectory ends up somewhere outside the $\varepsilon$-neighborhood of this particular optimum within a short \textit{recurrence time}; (2) it enters this $\varepsilon$-neighborhood by the recurrence time and stays there until a potentially exponentially long \textit{escape time}. We call this phenomenon \textit{empirical metastability}. This two-timescale characterization aligns nicely with the existing literature in the following two senses. First, the effective recurrence time (i.e., the number of iterations multiplied by the stepsize) is dimension-independent, and resembles the convergence time of continuous-time deterministic Gradient Descent (GD). However, unlike GD, the Langevin algorithm does not require strong conditions on local initialization, and has the possibility of eventually visiting all optima. Second, the scaling of the escape time is consistent with the Eyring-Kramers law, which states that the Langevin scheme will eventually visit all local minima, but will take an exponentially long time to transit among them. We apply this path-wise concentration result in the context of statistical learning to examine local notions of generalization and optimality.
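The two-timescale behavior described above can be illustrated with a minimal numerical sketch of the discrete-time Langevin iteration, $w_{t+1} = w_t - \eta \nabla F(w_t) + \sqrt{2\eta/\beta}\,\xi_t$. The toy 1-D double-well objective, the parameter values, and all function names below are illustrative choices, not taken from the paper.

```python
import numpy as np

def grad_double_well(w):
    # F(w) = (w^2 - 1)^2 is non-convex, with local minima at w = -1 and w = +1
    # separated by a barrier at w = 0.
    return 4.0 * w * (w**2 - 1.0)

def langevin(w0, eta=1e-3, beta=10.0, steps=20000, seed=0):
    """Discrete-time Langevin algorithm:
    w_{t+1} = w_t - eta * grad F(w_t) + sqrt(2 * eta / beta) * xi_t,
    with xi_t standard Gaussian noise and inverse temperature beta."""
    rng = np.random.default_rng(seed)
    w = w0
    traj = [w]
    for _ in range(steps):
        noise = np.sqrt(2.0 * eta / beta) * rng.standard_normal()
        w = w - eta * grad_double_well(w) + noise
        traj.append(w)
    return np.array(traj)

# From an arbitrary initialization, the iterate falls into the
# epsilon-neighborhood of one of the two minima within a short recurrence
# time, then lingers there: escaping over the barrier takes on the order
# of exp(beta * barrier height) effective time (Eyring-Kramers scaling).
traj = langevin(w0=0.5)
```

With a small stepsize and moderate noise, the trajectory concentrates near one minimum for many iterations; raising the temperature (lowering `beta`) shortens the escape time and lets the scheme transit between basins.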


