Journal of Machine Learning Research

Nash Q-Learning for General-Sum Stochastic Games

Abstract

We extend Q-learning to a noncooperative multiagent context, using the framework of general-sum stochastic games. A learning agent maintains Q-functions over joint actions, and performs updates based on assuming Nash equilibrium behavior over the current Q-values. This learning protocol provably converges given certain restrictions on the stage games (defined by Q-values) that arise during learning. Experiments with a pair of two-player grid games suggest that such restrictions on the game structure are not necessarily required. Stage games encountered during learning in both grid environments violate the conditions. However, learning consistently converges in the first grid game, which has a unique equilibrium Q-function, but sometimes fails to converge in the second, which has three different equilibrium Q-functions. In a comparison of offline learning performance in both games, we find agents are more likely to reach a joint optimal path with Nash Q-learning than with a single-agent Q-learning method. When at least one agent adopts Nash Q-learning, the performance of both agents is better than using single-agent Q-learning. We have also implemented an online version of Nash Q-learning that balances exploration with exploitation, yielding improved performance.
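The update the abstract describes works as follows: after observing the joint action, the rewards, and the next state, each agent computes a Nash equilibrium of the stage game formed by all agents' Q-values at the next state, and backs up its own payoff under that equilibrium. The sketch below is a minimal two-agent illustration of this idea, not the authors' implementation; the helper names (pure_nash, nash_q_update) are hypothetical, and the stage game is resolved only by enumerating pure-strategy equilibria, whereas the paper's algorithm selects a (possibly mixed) Nash equilibrium of each stage game.

```python
# Minimal sketch of a two-agent Nash-Q update (illustrative, not the paper's code).
import numpy as np

def pure_nash(Q1, Q2):
    """Return one pure-strategy Nash equilibrium (a1, a2) of the stage game
    given by payoff matrices Q1, Q2 (shape: n_actions1 x n_actions2), or None."""
    n1, n2 = Q1.shape
    for a1 in range(n1):
        for a2 in range(n2):
            # a1 must be a best response to a2, and a2 a best response to a1
            if Q1[a1, a2] >= Q1[:, a2].max() and Q2[a1, a2] >= Q2[a1, :].max():
                return a1, a2
    return None  # no pure equilibrium; the full algorithm would use a mixed one

def nash_q_update(Q1, Q2, s, a1, a2, r1, r2, s_next, alpha=0.1, gamma=0.99):
    """One Nash-Q step for both agents. Q1, Q2 map each state to an
    n_actions1 x n_actions2 array of joint-action values."""
    eq = pure_nash(Q1[s_next], Q2[s_next])
    if eq is None:
        # fallback for this sketch only: pick the joint action maximizing total value
        eq = np.unravel_index(np.argmax(Q1[s_next] + Q2[s_next]), Q1[s_next].shape)
    nash_v1 = Q1[s_next][eq]  # agent 1's payoff at the selected equilibrium
    nash_v2 = Q2[s_next][eq]  # agent 2's payoff at the selected equilibrium
    Q1[s][a1, a2] += alpha * (r1 + gamma * nash_v1 - Q1[s][a1, a2])
    Q2[s][a1, a2] += alpha * (r2 + gamma * nash_v2 - Q2[s][a1, a2])

# Example usage: 2 states, 2 actions per agent, Q-values initialized to zero.
Q1 = np.zeros((2, 2, 2))
Q2 = np.zeros((2, 2, 2))
nash_q_update(Q1, Q2, s=0, a1=1, a2=0, r1=1.0, r2=0.5, s_next=1)
```

Each agent's update is the usual Q-learning step except that the bootstrap target uses the agent's equilibrium payoff in the next-state stage game rather than its own greedy maximum, which is what distinguishes Nash Q-learning from independent single-agent Q-learning.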
