Nonconvergence to Saddle Boundary Points under Perturbed Reinforcement Learning.
Abstract

This paper presents a novel reinforcement learning algorithm and provides conditions for global convergence to Nash equilibria. For several classes of reinforcement learning schemes, including the ones proposed here, excluding convergence to action profiles that are not Nash equilibria may not be trivial unless the step-size sequence is appropriately tailored to the specifics of the game. In this paper we sidestep these issues by introducing a perturbed reinforcement learning scheme in which the strategy of each agent is perturbed by a strategy-dependent perturbation (or mutation) function. Contrary to prior work on equilibrium selection in games, where perturbation functions are globally state dependent, the perturbation function here is assumed to be local, i.e., it depends only on the strategy of each agent. We provide conditions under which the strategies of the agents converge almost surely to an arbitrarily small neighborhood of the set of Nash equilibria. This extends prior analysis of reinforcement learning in games, which has primarily focused on urn processes. We finally specialize the results to a class of potential games.
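The abstract does not spell out the exact update rule, but the idea can be sketched with the expected motion (replicator-style mean dynamics) of a standard reinforcement scheme, augmented with a local, strategy-dependent perturbation that depends only on each agent's own strategy. In the 2x2 coordination game below, a potential game, the fully mixed equilibrium is a saddle point; the perturbation helps the dynamics escape it. All numerical choices (payoff matrix, step size, perturbation magnitude) are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

# Illustrative sketch only: the paper's exact scheme is not given in the
# abstract. We use the expected motion of a reinforcement learning update
# (replicator dynamics) plus a local perturbation (mutation) function.

U = np.eye(2)  # coordination game: payoff 1 if actions match, 0 otherwise

def perturb(x, delta=0.02):
    """Local, strategy-dependent perturbation: mix toward the uniform
    strategy by an amount that depends only on the agent's own strategy
    and vanishes at pure strategies (an assumed functional form)."""
    lam = delta * (1.0 - x.max())
    return (1.0 - lam) * x + lam * np.full_like(x, 1.0 / x.size)

def step(x1, x2, eps=0.1):
    """One step of the expected reinforcement update for both agents."""
    p1, p2 = U @ x2, U.T @ x1            # expected payoff of each action
    x1 = x1 + eps * x1 * (p1 - x1 @ p1)  # replicator-style expected motion
    x2 = x2 + eps * x2 * (p2 - x2 @ p2)
    return perturb(x1), perturb(x2)

# Start just off the saddle (the mixed equilibrium at (0.5, 0.5)).
x1 = np.array([0.51, 0.49])
x2 = np.array([0.50, 0.50])
for _ in range(2000):
    x1, x2 = step(x1, x2)

print(x1, x2)  # both strategies concentrate near a pure Nash equilibrium
```

Because the perturbation vanishes at pure strategies, the dynamics can settle arbitrarily close to the pure coordination equilibrium rather than being held a fixed distance away, which mirrors the abstract's "arbitrarily small neighborhood" claim.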
