AAAI Conference on Artificial Intelligence

Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games

Abstract

With recent advances in solving large zero-sum extensive-form games, there is growing interest in the inverse problem of inferring underlying game parameters given only access to agent actions. Although recent work provides a powerful differentiable end-to-end learning framework that embeds a game solver within a deep-learning framework, allowing unknown game parameters to be learned via backpropagation, this framework faces significant limitations when applied to boundedly rational human agents and large-scale problems, which limits its practicality. In this paper, we address these limitations and propose a framework that is applicable to more practical settings. First, seeking to learn the rationality of human agents in complex two-player zero-sum games, we draw upon well-known ideas in decision theory to obtain a concise and interpretable agent behavior model, and derive solvers and gradients for end-to-end learning. Second, to scale up to large, real-world scenarios, we propose an efficient first-order primal-dual method that exploits the structure of extensive-form games, yielding significantly faster computation for both game solving and gradient computation. When tested on randomly generated games, we report speedups of orders of magnitude over previous approaches. We also demonstrate the effectiveness of our model on both real-world one-player settings and synthetic data.
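The abstract does not name the agent behavior model, so the following is only an illustrative sketch of what a "boundedly rational" agent model with a learnable rationality parameter can look like. It assumes a logit quantal response model (a well-known idea from decision theory, not necessarily the authors' choice) on a small zero-sum matrix game rather than an extensive-form game. The function names, the parameter `lam`, and the damped fixed-point iteration are all assumptions made for illustration; the iteration is a simple stand-in and is not the primal-dual method described in the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def logit_qre(payoff, lam, steps=2000, damping=0.2):
    """Approximate a logit quantal response equilibrium of a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives its negative.
    lam is a hypothetical rationality parameter: 0 gives uniform play, and larger
    values put more weight on higher-payoff actions.
    The damped iteration is a heuristic and may need smaller damping for large lam.
    """
    n, m = len(payoff), len(payoff[0])
    x = [1.0 / n] * n  # row player's mixed strategy
    y = [1.0 / m] * m  # column player's mixed strategy
    for _ in range(steps):
        # Expected payoff of each pure action against the opponent's current mix.
        row_values = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
        col_values = [-sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m)]
        # Noisy (softmax) responses instead of exact best responses.
        bx = softmax([lam * v for v in row_values])
        by = softmax([lam * v for v in col_values])
        # Move both strategies part of the way toward their noisy responses.
        x = [(1 - damping) * a + damping * b for a, b in zip(x, bx)]
        y = [(1 - damping) * a + damping * b for a, b in zip(y, by)]
    return x, y


if __name__ == "__main__":
    game = [[2.0, -1.0], [-1.0, 1.0]]  # made-up payoffs for illustration
    for lam in (0.0, 1.0):
        print(lam, logit_qre(game, lam))
```

In the inverse problem the abstract describes, a parameter like `lam` (and possibly the payoffs) would be unknown and fitted to observed actions; this sketch covers only the forward direction, from parameters to predicted behavior.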