Opponent Modelling by Sequence Prediction and Lookahead in Two-Player Games



Abstract

Learning a strategy that maximises total reward in a multi-agent system is a hard problem when it depends on other agents' strategies. Many previous approaches consider opponents which are reactive and memoryless. In this paper, we use sequence prediction algorithms to perform opponent modelling in two-player games, to model opponents with memory. We argue that to compete with opponents with memory, lookahead is required. We combine these algorithms with reinforcement learning and lookahead action selection, allowing them to find strategies that maximise total reward up to a limited depth. Experiments confirm lookahead is required, and show these algorithms successfully model and exploit opponent strategies with different memory lengths. The proposed approach outperforms popular and state-of-the-art reinforcement learning algorithms in terms of learning speed and final performance.
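The interplay between sequence prediction and lookahead described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the sequence prediction algorithms used, so a simple frequency count over the last `k` joint actions stands in for them, the reinforcement learning component is replaced by a known payoff table, and the game (iterated prisoner's dilemma), the Tit-for-Tat opponent, and all names (`FrequencyPredictor`, `lookahead_value`, `train_against_tit_for_tat`, `PAYOFF`) are assumptions made for illustration.

```python
from collections import Counter, defaultdict

ACTIONS = ("C", "D")
# Assumed prisoner's dilemma payoffs for the learner: PAYOFF[(mine, theirs)]
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


class FrequencyPredictor:
    """Predicts the opponent's next action from the last k joint actions (k >= 1)."""

    def __init__(self, k=1):
        self.k = k
        self.counts = defaultdict(Counter)

    def update(self, history):
        """Count the latest opponent action under the k joint actions before it."""
        if len(history) > self.k:
            context = tuple(history[-self.k - 1:-1])
            self.counts[context][history[-1][1]] += 1

    def predict(self, history):
        """Return a distribution over opponent actions (uniform if context unseen)."""
        seen = self.counts.get(tuple(history[-self.k:]))
        if not seen:
            return {a: 1 / len(ACTIONS) for a in ACTIONS}
        total = sum(seen.values())
        return {a: seen[a] / total for a in ACTIONS}


def lookahead_value(model, history, depth):
    """Return (expected total payoff, best first action) over `depth` steps."""
    if depth == 0:
        return 0.0, None
    probs = model.predict(history)
    best_value, best_action = float("-inf"), None
    for mine in ACTIONS:
        value = 0.0
        for theirs, p in probs.items():
            if p == 0:
                continue  # skip branches the model considers impossible
            future, _ = lookahead_value(model, history + [(mine, theirs)], depth - 1)
            value += p * (PAYOFF[(mine, theirs)] + future)
        if value > best_value:
            best_value, best_action = value, mine
    return best_value, best_action


def train_against_tit_for_tat(my_moves, k=1):
    """Play `my_moves` against Tit-for-Tat and fit the predictor on the result."""
    model, history = FrequencyPredictor(k), []
    for mine in my_moves:
        theirs = history[-1][0] if history else "C"  # Tit-for-Tat copies my last move
        history.append((mine, theirs))
        model.update(history)
    return model, history


model, _ = train_against_tit_for_tat("CCDD" * 3)
state = [("C", "C")]
print(lookahead_value(model, state, depth=1))  # myopic choice
print(lookahead_value(model, state, depth=2))  # two-step lookahead
```

Under these assumptions, a depth of 1 selects defection, because the model predicts that the opponent will cooperate next, while a depth of 2 selects cooperation, because the model also predicts the retaliation that follows a defection. This is a toy version of the abstract's argument that lookahead is required against opponents with memory.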


Bibliographic details

Source
International Conference on Artificial Intelligence and Soft Computing | 2013 | pp. 385-396 | 12 pages
Authors
Richard Mealing; Jonathan L. Shapiro
Original format: PDF
Keywords
Opponent Modelling; Sequence Prediction; Lookahead; Reinforcement Learning; Multi-Agent Learning; Game Theory


Similar literature


1. Opponent Modeling by Expectation–Maximization and Sequence Prediction in Simplified Poker [J]. Richard Mealing, Jonathan L. Shapiro. IEEE Transactions on Computational Intelligence and AI in Games. 2017, No. 1
2. A Game AI Production Shell Framework: Generating AI Opponents for Geomorphic-Isometric Strategy Games via Modeling of Expert Player Intuition [J]. Andrew Chiou. Australian Journal of Intelligent Information Processing Systems. 2007, No. 4
3. Difference of reciprocity effect in two coevolutionary models of presumed two-player and multiplayer games [J]. Jun Tanimoto. Physical Review E. 2013, No. 6
4. Opponent Modelling by Sequence Prediction and Lookahead in Two-Player Games [C]. Richard Mealing, Jonathan L. Shapiro. International Conference on Artificial Intelligence and Soft Computing. 2013
5. Computational Methods of Hidden Markov Models With Respect To CpG Island Prediction in DNA Sequences [D]. Ortega, Roberto Angel, Jr. 2011
6. Modified Asano-Ohya-Khrennikov quantum-like model for decision-making process in a two-player game with nonlinear self- and cross-interaction terms of brain’s amygdala and prefrontal-cortex [O]. Luluk Muthoharoh, Hendradi Hardhienata, Husin Alatas. 2020
7. Partial observability during predictions of the opponent's movements in an RTS game [O]. Butler S, Demiris Y. 2010
8. Outperforming Game Theoretic Play with Opponent Modeling in Two Player Dominoes [R]. Myers, M. M. 2014
