Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games

Abstract

Multiagent learning is a key problem in AI. In the presence of multiple Nash equilibria, even agents with non-conflicting interests may not be able to learn an optimal coordination policy. The problem is exacerbated if the agents do not know the game and independently receive noisy payoffs. So, multiagent reinforcement learning involves two interrelated problems: identifying the game and learning to play. In this paper, we present optimal adaptive learning, the first algorithm that converges to an optimal Nash equilibrium with probability 1 in any team Markov game. We provide a convergence proof, and show that the algorithm's parameters are easy to set to meet the convergence conditions.
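The two interrelated problems the abstract names can be illustrated on a single-state team game. The following is a minimal sketch, not the paper's optimal adaptive learning algorithm: it uses a hypothetical 2x2 common-payoff matrix with two optimal Nash equilibria, estimates the unknown payoffs from noisy samples (identifying the game), and then has both agents apply the same deterministic tie-breaking rule so they coordinate on one optimal equilibrium (learning to play).

```python
import random

# Hypothetical team matrix game: both agents receive the SAME payoff.
# Joint actions (0,0) and (1,1) are both optimal Nash equilibria (payoff 10),
# so independent learners risk miscoordinating on (0,1) or (1,0) (payoff 0).
TRUE_PAYOFF = {(0, 0): 10.0, (1, 1): 10.0, (0, 1): 0.0, (1, 0): 0.0}

def noisy_payoff(joint):
    # Agents never see TRUE_PAYOFF; they only observe noise-corrupted samples.
    return TRUE_PAYOFF[joint] + random.gauss(0, 1.0)

random.seed(0)

# Problem 1: identify the game -- running average of payoffs per joint action.
estimates = {a: 0.0 for a in TRUE_PAYOFF}
counts = {a: 0 for a in TRUE_PAYOFF}
for _ in range(2000):
    joint = (random.randint(0, 1), random.randint(0, 1))  # exploration
    counts[joint] += 1
    estimates[joint] += (noisy_payoff(joint) - estimates[joint]) / counts[joint]

# Problem 2: learn to play -- among the estimated-optimal joint actions,
# both agents independently apply the SAME deterministic rule
# (lexicographic order), so they select the same equilibrium.
best = max(estimates.values())
optimal = sorted(a for a, v in estimates.items() if v > best - 0.5)
agreed = optimal[0]
print(agreed)
```

The key design point mirrored from the abstract is that equilibrium selection, not equilibrium existence, is the obstacle: once the game is identified to sufficient accuracy, a shared ordering over joint actions resolves the tie among optimal equilibria without any communication.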
