JMLR: Workshop and Conference Proceedings

Learning Nash Equilibrium for General-Sum Markov Games from Batch Data


Abstract

This paper addresses the problem of learning a Nash equilibrium in $γ$-discounted multiplayer general-sum Markov Games (MGs) in a batch setting. As the number of players in an MG increases, the agents may either collaborate or team apart to increase their final rewards. One way to address this problem is to look for a Nash equilibrium. Although several techniques have been developed for the subcase of two-player zero-sum MGs, those techniques fail to find a Nash equilibrium in general-sum Markov Games. In this paper, we introduce a new definition of $ε$-Nash equilibrium in MGs which captures the quality of a strategy in multiplayer games. We prove that minimizing the norm of two Bellman-like residuals implies learning such an $ε$-Nash equilibrium. Then, we show that minimizing an empirical estimate of the $L_p$ norm of these Bellman-like residuals allows learning a Nash equilibrium for general-sum games in the batch setting. Finally, we introduce a neural network architecture that successfully learns a Nash equilibrium in generic multiplayer general-sum turn-based MGs.
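To make the quantity in the abstract concrete, the following is a minimal toy sketch of an empirical $L_p$ norm of a Bellman-like residual evaluated on a batch of off-policy transitions. The tabular setup, the fixed joint strategy `pi`, and the estimator shape are all illustrative assumptions for a two-player turn-based game; this is not the paper's exact estimator or its neural architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy setting (assumption, not the paper's setup): a small
# turn-based general-sum MG with tabular Q-values, two players, discount gamma.
n_states, n_actions, n_players, gamma, p = 5, 3, 2, 0.9, 2

# Hypothetical batch of transitions (s, a, r, s') collected off-policy,
# with one reward component per player (general-sum setting).
batch_size = 64
s = rng.integers(n_states, size=batch_size)
a = rng.integers(n_actions, size=batch_size)
r = rng.normal(size=(batch_size, n_players))
s2 = rng.integers(n_states, size=batch_size)

# Candidate state-action values, one table per player, and a fixed joint
# strategy pi(a|s) whose quality we want to assess.
Q = rng.normal(size=(n_players, n_states, n_actions))
pi = rng.dirichlet(np.ones(n_actions), size=n_states)

def empirical_bellman_residual(Q, pi, p):
    """Empirical L_p norm of a Bellman-like residual over the batch,
    averaged over players:
        || Q_i(s, a) - (r_i + gamma * E_{a' ~ pi(.|s')}[Q_i(s', a')]) ||_p
    This is a sketch of the kind of loss the paper proposes to minimize,
    not its exact two-residual estimator."""
    total = 0.0
    for i in range(n_players):
        # Expected next-step value of player i under the joint strategy pi.
        target = r[:, i] + gamma * (pi[s2] * Q[i, s2]).sum(axis=1)
        residual = Q[i, s, a] - target
        total += np.mean(np.abs(residual) ** p) ** (1.0 / p)
    return total / n_players

loss = empirical_bellman_residual(Q, pi, p)
print(loss)
```

In the paper's batch setting, a loss of this form would be driven toward zero by adjusting the value estimates (and strategy) rather than by collecting new data, which is what makes the approach applicable when only a fixed dataset of transitions is available.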
