Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games

Abstract

With the recent advances in solving large, zero-sum extensive-form games, there is growing interest in the inverse problem of inferring underlying game parameters given only access to agent actions. Although recent work provides a powerful differentiable end-to-end learning framework that embeds a game solver within a deep-learning framework, allowing unknown game parameters to be learned via backpropagation, this framework faces significant limitations when applied to boundedly rational human agents and large-scale problems, leading to poor practicality. In this paper, we address these limitations and propose a framework that is applicable to more practical settings. First, seeking to learn the rationality of human agents in complex two-player zero-sum games, we draw upon well-known ideas in decision theory to obtain a concise and interpretable agent behavior model, and derive solvers and gradients for end-to-end learning. Second, to scale up to large, real-world scenarios, we propose an efficient first-order primal-dual method which exploits the structure of extensive-form games, yielding significantly faster computation for both game solving and gradient computation. When tested on randomly generated games, we report speedups of orders of magnitude over previous approaches. We also demonstrate the effectiveness of our model on both real-world one-player settings and synthetic data.
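The abstract does not name the agent behavior model it derives. As a rough illustration of what a "concise and interpretable" model of bounded rationality can look like, the sketch below uses the logit (softmax) choice rule, one well-known decision-theoretic model, which is an assumption here and not necessarily the authors' formulation. The function names, the `rationality` parameter, and the 2x2 payoff matrix are all hypothetical.

```python
import math


def expected_utilities(payoff, opponent_strategy):
    """Expected utility of each row action against a mixed column strategy."""
    return [sum(p * q for p, q in zip(row, opponent_strategy)) for row in payoff]


def logit_response(utilities, rationality):
    """Logit (softmax) choice probabilities for a single agent.

    rationality = 0 gives uniform random play; larger values put more
    probability on the highest-utility action. (Assumed model, for
    illustration only.)
    """
    scaled = [rationality * u for u in utilities]
    shift = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical zero-sum payoff matrix for the row player.
payoff = [[2.0, -1.0],
          [-1.0, 1.0]]
opponent = [0.5, 0.5]  # column player mixes uniformly

utilities = expected_utilities(payoff, opponent)
for lam in (0.0, 1.0, 10.0):
    print(lam, logit_response(utilities, lam))
```

In a model of this kind, a single scalar controls how far play departs from a best response, which is what makes the parameter interpretable and a natural target for learning from observed actions.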

Bibliographic record

Source
AAAI Conference on Artificial Intelligence | 2019 | pp. 5458-6186 | 8 pages
Authors
Chun Kai Ling; Fei Fang; J. Zico Kolter
Original format: PDF
Chinese Library Classification: TP18-53

Similar literature

1. Learning nonlinear robust control as a data-driven zero-sum two-player game for an active suspension system [J]. Mircea-Bogdan Radac, Timotei Lala. IFAC PapersOnLine, 2020, No. 2.
2. Online concurrent reinforcement learning algorithm to solve two-player zero-sum games for partially unknown nonlinear continuous-time systems [J]. Yasini Sholeh, Karimpour Ali, Sistani Mohammad-Bagher Naghibi. International Journal of Adaptive Control and Signal Processing, 2015, No. 4.
3. LL_2, a simple reinforcement learning scheme for two-player zero-sum Markov games [J]. Benoit Frenay, Marco Saerens. Neurocomputing, 2009, No. 7-9.
4. Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games [C]. Chun Kai Ling, Fei Fang, J. Zico Kolter. AAAI Conference on Artificial Intelligence, 2019.
5. Deception in two-player zero-sum stochastic games: Theory and application to warfare games [D]. Singh, Rajdeep. 2006.
6. Spike-based Decision Learning of Nash Equilibria in Two-Player Games [O]. Johannes Friedrich, Walter Senn. 2012.
7. Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games [O]. Chun Kai Ling, Fei Fang, J. Zico Kolter. 2019.
