Monte Carlo Tree Search (MCTS) has recently been successfully used to createstrategies for playing imperfect-information games. Despite its popularity,there are no theoretic results that guarantee its convergence to a well-definedsolution, such as Nash equilibrium, in these games. We partially fill this gapby analysing MCTS in the class of zero-sum extensive-form games withsimultaneous moves but otherwise perfect information. The lack of informationabout the opponent's concurrent moves already causes that optimal strategiesmay require randomization. We present theoretic as well as empiricalinvestigation of the speed and quality of convergence of these algorithms tothe Nash equilibria. Primarily, we show that after minor technicalmodifications, MCTS based on any (approximately) Hannan consistent selectionfunction always converges to an (approximate) subgame perfect Nash equilibrium.Without these modifications, Hannan consistency is not sufficient to ensuresuch convergence and the selection function must satisfy additional properties,which empirically hold for the most common Hannan consistent algorithms.
展开▼