
Efficient Monte Carlo Counterfactual Regret Minimization in Games with Many Player Actions



Abstract

Counterfactual Regret Minimization (CFR) is a popular, iterative algorithm for computing strategies in extensive-form games. The Monte Carlo CFR (MCCFR) variants reduce the per-iteration time cost of CFR by traversing a smaller, sampled portion of the tree. The previously most effective instances of MCCFR can still be very slow in games with many player actions since they sample every action for a given player. In this paper, we present a new MCCFR algorithm, Average Strategy Sampling (AS), that samples a subset of the player's actions according to the player's average strategy. Our new algorithm is inspired by a new, tighter bound on the number of iterations required by CFR to converge to a given solution quality. In addition, we prove a similar, tighter bound for AS and other popular MCCFR variants. Finally, we validate our work by demonstrating that AS converges faster than previous MCCFR algorithms in both no-limit poker and Bluff.
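The abstract only describes AS at a high level: each of the updating player's actions is sampled according to that player's cumulative average strategy. The sketch below is a minimal, hypothetical Python illustration of what such action sampling at a single information set could look like; the function name, the exact sampling probability, and the parameters `epsilon`, `beta`, and `tau` are illustrative assumptions rather than the authors' implementation.

```python
import random

def sample_action_subset(cumulative_avg_strategy, epsilon=0.05, beta=1.0, tau=1.0):
    """Hypothetical sketch of average-strategy action sampling at one information set.

    cumulative_avg_strategy: dict mapping each of the player's actions to the
    cumulative average-strategy weight accumulated for it so far.
    Each action is kept independently with a probability that grows with its
    weight under the average strategy, floored at epsilon so that rarely
    played actions are still explored occasionally.
    """
    total = sum(cumulative_avg_strategy.values())
    sampled = []
    for action, weight in cumulative_avg_strategy.items():
        rho = max(epsilon, (beta + tau * weight) / (beta + total))
        if random.random() < min(1.0, rho):
            sampled.append(action)
    return sampled

# Example: actions favoured by the average strategy are sampled more often,
# so only a small subset of a large action set is typically traversed.
weights = {"fold": 10.0, "call": 250.0, "raise_small": 900.0, "raise_big": 40.0}
print(sample_action_subset(weights))
```

Because only the sampled subset of actions is traversed on an iteration, the per-iteration cost stays small even when the player has many actions, which is the setting the paper targets.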

