Conference on Neural Information Processing Systems

A Kernel Loss for Solving the Bellman Equation



Abstract

Value function learning plays a central role in many state-of-the-art reinforcement-learning algorithms. Many popular algorithms like Q-learning do not optimize any objective function, but are fixed-point iterations of some variants of Bellman operator that are not necessarily a contraction. As a result, they may easily lose convergence guarantees, as can be observed in practice. In this paper, we propose a novel loss function, which can be optimized using standard gradient-based methods with guaranteed convergence. The key advantage is that its gradient can be easily approximated using sampled transitions, avoiding the need for double samples required by prior algorithms like residual gradient. Our approach may be combined with general function classes such as neural networks, using either on- or off-policy data, and is shown to work reliably and effectively in several benchmarks, including classic problems where standard algorithms are known to diverge.
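To make the "double sample" point concrete, below is a minimal sketch (not the paper's reference implementation) of a U-statistic estimate of a kernel Bellman loss: each transition contributes one empirical Bellman residual, and only residuals from *different* transitions are paired through a positive-definite kernel, so no state needs two independent next-state samples. The RBF bandwidth and the simple NumPy value-function interface are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    # Gaussian RBF kernel matrix between two batches of states, shape (n, m).
    d = x[:, None, :] - y[None, :, :]
    return np.exp(-np.sum(d * d, axis=-1) / (2.0 * bandwidth ** 2))

def kernel_bellman_loss(V, states, rewards, next_states, gamma=0.99):
    """U-statistic estimate of a kernel Bellman loss (illustrative sketch).

    delta_i = r_i + gamma * V(s_i') - V(s_i) is the sampled Bellman residual
    of transition i. Pairing residuals from distinct transitions (i != j)
    through the kernel avoids the biased squared term E[delta_i^2], which is
    what forces residual-gradient methods to draw two independent next-state
    samples per state.
    """
    delta = rewards + gamma * V(next_states) - V(states)   # shape (n,)
    K = rbf_kernel(states, states)                         # shape (n, n)
    n = len(delta)
    outer = np.outer(delta, delta) * K
    # Drop the diagonal (i == j) terms; average over the n*(n-1) cross pairs.
    return (outer.sum() - np.trace(outer)) / (n * (n - 1))
```

Because the estimate is an average over sampled transitions, its gradient with respect to the parameters of `V` can be taken directly with standard automatic differentiation, which is the property the abstract highlights.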
