Journal of Machine Learning Research > Lenient Learning in Independent-Learner Stochastic Cooperative Games

Lenient Learning in Independent-Learner Stochastic Cooperative Games



Abstract

We introduce the Lenient Multiagent Reinforcement Learning 2 (LMRL2) algorithm for independent-learner stochastic cooperative games. LMRL2 is designed to overcome a pathology called relative overgeneralization, and to do so while still performing well in games with stochastic transitions, stochastic rewards, and miscoordination. We discuss the existing literature, then compare LMRL2 against other algorithms drawn from the literature which can be used for games of this kind: traditional ("Distributed") Q-learning, Hysteretic Q-learning, WoLF-PHC, SOoN, and (for repeated games only) FMQ. The results show that LMRL2 is very effective in both of our measures (complete and correct policies), and is found in the top rank more often than any other technique. LMRL2 is also easy to tune: though it has many available parameters, almost all of them stay at default settings. Generally the algorithm is optimally tuned with a single parameter, if any. We then examine and discuss a number of side-issues and options for LMRL2.
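To illustrate the leniency idea the abstract refers to, here is a minimal sketch of a generic lenient Q-learning update, not the paper's exact LMRL2 procedure: each state-action pair carries a temperature, and while the temperature is high the learner probabilistically ignores updates that would lower its value estimate (which is what mitigates relative overgeneralization). The function name `lenient_update` and the parameters `kappa`, `decay`, and `min_temp` are illustrative assumptions, not identifiers from the paper.

```python
import math
import random

def lenient_update(Q, s, a, target, alpha, temp,
                   kappa=2.0, decay=0.995, min_temp=0.0):
    """One lenient Q-learning step (illustrative sketch, not LMRL2 itself).

    Q:      dict mapping (state, action) -> value estimate
    temp:   dict mapping (state, action) -> temperature (leniency level)
    target: the TD target, e.g. r + gamma * max_a' Q[(s', a')]
    """
    # Leniency is the probability of IGNORING an update that would
    # lower Q. High temperature -> very lenient; as temperature cools,
    # behavior approaches ordinary Q-learning.
    leniency = 1.0 - math.exp(-kappa * temp[(s, a)])

    # Value-raising updates are always applied (optimism); value-lowering
    # updates are applied only with probability (1 - leniency).
    if target > Q[(s, a)] or random.random() > leniency:
        Q[(s, a)] += alpha * (target - Q[(s, a)])

    # Cool the temperature for this state-action pair.
    temp[(s, a)] = max(min_temp, temp[(s, a)] * decay)
```

With temperature near zero the leniency term vanishes and every update goes through, so the learner degenerates to standard Q-learning; with a large temperature, low-reward experiences (e.g. those caused by a partner's exploration) are mostly ignored rather than allowed to drag down the estimate of a jointly optimal action.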


