Journal: Machine Learning

Analysis of Hannan consistent selection for Monte Carlo tree search in simultaneous move games



Abstract

Hannan consistency, or no external regret, is a key concept for learning in games. An action selection algorithm is Hannan consistent (HC) if its performance is eventually as good as selecting the best fixed action in hindsight. If both players in a zero-sum normal form game use a Hannan consistent algorithm, their average behavior converges to a Nash equilibrium of the game. A similar result is known for extensive form games, but the played strategies need to be Hannan consistent with respect to the counterfactual values, which are often difficult to obtain. We study zero-sum extensive form games with simultaneous moves, but otherwise perfect information. These games generalize normal form games and are a special case of extensive form games. We study whether applying HC algorithms in each decision point of these games directly to the observed payoffs leads to convergence to a Nash equilibrium. This learning process corresponds to a class of Monte Carlo Tree Search algorithms, which are popular for playing simultaneous-move games but do not have any known performance guarantees. We show that using HC algorithms directly on the observed payoffs is not sufficient to guarantee convergence. With an additional averaging over joint actions, convergence is guaranteed, but empirically slower. We further define an additional property of HC algorithms that is sufficient to guarantee convergence without the averaging, and we empirically show that commonly used HC algorithms have this property.
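To illustrate the normal-form result the abstract starts from, the sketch below runs regret matching (one commonly used Hannan consistent algorithm) in self-play on rock-paper-scissors. This is a minimal illustration of HC learning in a single zero-sum normal form game, not the paper's simultaneous-move MCTS setting; the game, iteration count, and function names are choices made for the example.

```python
import random

# Row player's payoffs in rock-paper-scissors, a zero-sum normal form game
# whose unique Nash equilibrium is uniform play by both players.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def regret_matching_strategy(regrets):
    """Regret matching: play actions proportionally to positive cumulative regret."""
    positives = [max(r, 0.0) for r in regrets]
    total = sum(positives)
    if total > 0.0:
        return [p / total for p in positives]
    return [1.0 / len(regrets)] * len(regrets)  # no positive regret: play uniformly


def self_play(iterations=50000, seed=0):
    """Both players use regret matching; return their average (time-averaged) strategies."""
    rng = random.Random(seed)
    n = len(PAYOFF)
    regrets = [[0.0] * n, [0.0] * n]
    strategy_sums = [[0.0] * n, [0.0] * n]
    for _ in range(iterations):
        strategies = [regret_matching_strategy(regrets[p]) for p in (0, 1)]
        a0 = rng.choices(range(n), weights=strategies[0])[0]
        a1 = rng.choices(range(n), weights=strategies[1])[0]
        for p in (0, 1):
            for a in range(n):
                strategy_sums[p][a] += strategies[p][a]
        # External regret update against the opponent's realized action:
        # how much better each fixed action would have done than the action played.
        for a in range(n):
            regrets[0][a] += PAYOFF[a][a1] - PAYOFF[a0][a1]
            regrets[1][a] += -PAYOFF[a0][a] - (-PAYOFF[a0][a1])
    return ([s / iterations for s in strategy_sums[0]],
            [s / iterations for s in strategy_sums[1]])
```

Because both players' regrets grow sublinearly, the average strategies approach the uniform Nash equilibrium (each action probability near 1/3), even though the current strategies keep cycling. The paper's question is whether this same recipe, applied at every decision point of a simultaneous-move game tree using only observed payoffs, still converges.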
