Nonconvergence to Saddle Boundary Points under Perturbed Reinforcement Learning.

Abstract

This paper presents a novel reinforcement learning algorithm and provides conditions for global convergence to Nash equilibria. For several classes of reinforcement learning schemes, including the ones proposed here, excluding convergence to action profiles which are not Nash equilibria may not be trivial, unless the step-size sequence is appropriately tailored to the specifics of the game. In this paper we sidestep these issues by introducing a perturbed reinforcement learning scheme where the strategy of each agent is perturbed by a strategy-dependent perturbation (or mutation) function. Contrary to prior work on equilibrium selection in games, where perturbation functions are globally state dependent, the perturbation function here is assumed to be local, i.e., it depends only on the strategy of each agent. We provide conditions under which the strategies of the agents will converge to an arbitrarily small neighborhood of the set of Nash equilibria almost surely. This extends prior analysis of reinforcement learning in games, which has been primarily focused on urn processes. We finally specialize the results to a class of potential games.
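
The idea of a local, strategy-dependent perturbation can be illustrated with a small simulation. The sketch below is not the authors' algorithm: the payoff matrix, the update rule (a reward-weighted step toward the action just played), the shape of the `perturbation` function, and every parameter value (`lam`, `beta`, `step`) are assumptions chosen for illustration only. It shows two agents in a hypothetical 2x2 identical-payoff game, each mixing its own strategy with the uniform distribution by a weight that depends only on how close that strategy is to a vertex of the simplex.

```python
import random

# Hypothetical 2x2 coordination game with identical payoffs for both agents.
# PAYOFF[a0][a1] is the common reward; all values lie in (0, 1].
PAYOFF = [[1.0, 0.1],
          [0.1, 0.6]]


def perturbation(x, lam=0.01, beta=0.9):
    """Illustrative perturbation weight that depends only on the agent's own strategy x.

    It is zero while max(x) <= beta and rises linearly to lam as max(x)
    approaches 1, i.e. as the strategy approaches a vertex (pure action).
    """
    m = max(x)
    if m <= beta:
        return 0.0
    return lam * (m - beta) / (1.0 - beta)


def choose(x, rng, lam=0.01):
    """Sample an action from x mixed with the uniform distribution."""
    zeta = perturbation(x, lam)
    n = len(x)
    probs = [(1.0 - zeta) * p + zeta / n for p in x]
    return rng.choices(range(n), weights=probs)[0]


def update(x, action, reward, step=0.05):
    """Move x toward the played action by an amount proportional to the reward."""
    return [p + step * reward * ((1.0 if j == action else 0.0) - p)
            for j, p in enumerate(x)]


def simulate(rounds=5000, seed=0):
    """Run both agents for a fixed number of rounds and return their strategies."""
    rng = random.Random(seed)
    x = [[0.5, 0.5], [0.5, 0.5]]
    for _ in range(rounds):
        a = [choose(x[i], rng) for i in range(2)]
        r = PAYOFF[a[0]][a[1]]
        x = [update(x[i], a[i], r) for i in range(2)]
    return x


if __name__ == "__main__":
    for i, xi in enumerate(simulate()):
        print(f"agent {i}: {xi}")
```

Because `step * reward` stays in (0, 1), each update is a convex combination of the current strategy and a pure action, so the strategies remain valid probability vectors. Which action profile a given run settles near depends on the seed and on the assumed parameters; the sketch makes no claim about the convergence conditions established in the report.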

Bibliographic record

Authors: Chasparis, G. C.; Shamma, J. S.; Rantzer, A.
Year: 2012
Pages: 1-31
Total pages: 31
Original format: PDF
Language: English
Chinese Library Classification: Industrial technology
Keywords: Game theory; Multiagent systems; Strategy; Reinforcement learning; Nash equilibria

Similar documents

1. Nonconvergence to saddle boundary points under perturbed reinforcement learning [J]. Chasparis Georgios C., Shamma Jeff S., Rantzer Anders. International Journal of Game Theory. 2015, No. 3
2. Wada boundary bifurcations induced by boundary saddle-saddle collision [J]. Liu Xiao-Ming, Jiang Jun, Hong Ling. Physics Letters A. 2019, No. 2a3
3. The ubiquity of model-based reinforcement learning [J]. Bradley B Doll, Dylan A Simon, Nathaniel D Daw. Current Opinion in Neurobiology. 2012, No. 6
4. Multi-Target Trajectory Optimization with Neural Network and Reinforcement Learning [C]. Haiyang Li, Zhemin Chi, Hexi Baoyin. International Astronautical Congress. 2019
5. Numerical methods for singularly perturbed boundary value problems and singularly perturbed equations [D]. Savin, Igor. 2010
6. Frequency of reinforcement as a determinant of extinction-induced aggression during errorless discrimination learning [O]. M Rilling, H J Caplan. 1975
7. Nonconvergence to Saddle Boundary Points under Perturbed Reinforcement Learning [O]. 2012
