Conference: International Conference on Artificial Intelligence and Soft Computing

Opponent Modelling by Sequence Prediction and Lookahead in Two-Player Games

Abstract

Learning a strategy that maximises total reward in a multi-agent system is a hard problem when it depends on other agents' strategies. Many previous approaches consider opponents which are reactive and memoryless. In this paper, we use sequence prediction algorithms to perform opponent modelling in two-player games, to model opponents with memory. We argue that to compete with opponents with memory, lookahead is required. We combine these algorithms with reinforcement learning and lookahead action selection, allowing them to find strategies that maximise total reward up to a limited depth. Experiments confirm lookahead is required, and show these algorithms successfully model and exploit opponent strategies with different memory lengths. The proposed approach outperforms popular and state-of-the-art reinforcement learning algorithms in terms of learning speed and final performance.
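The abstract's central claim, that exploiting an opponent with memory requires lookahead, can be illustrated with a small sketch. This is not the authors' algorithm: a simple frequency-count predictor over the last k joint actions stands in for the sequence prediction algorithms, there is no reinforcement learning component, and the game (iterated prisoner's dilemma), the payoff values, the tit-for-tat opponent, and all names (`NGramOpponentModel`, `action_values`, `choose_action`, `observe`) are assumptions made for illustration.

```python
from collections import Counter, defaultdict

ACTIONS = ("C", "D")
# Assumed prisoner's dilemma payoffs for the learner: (mine, theirs) -> reward.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


class NGramOpponentModel:
    """Hypothetical predictor: counts opponent actions after each k-step context."""

    def __init__(self, k=1):
        self.k = k
        self.counts = defaultdict(Counter)

    def _context(self, history):
        # The last k joint actions (mine, theirs); empty tuple when k == 0.
        return tuple(history[-self.k:]) if self.k else ()

    def update(self, history, opp_action):
        self.counts[self._context(history)][opp_action] += 1

    def predict(self, history):
        seen = self.counts.get(self._context(history))
        if not seen:
            # Unseen context: fall back to a uniform guess.
            return {a: 1.0 / len(ACTIONS) for a in ACTIONS}
        total = sum(seen.values())
        return {a: seen[a] / total for a in ACTIONS}


def action_values(model, history, depth):
    """Expected total reward of each of our actions over the next `depth` rounds."""
    probs = model.predict(history)
    values = {}
    for mine in ACTIONS:
        total = 0.0
        for theirs, p in probs.items():
            if p == 0.0:
                continue
            future = 0.0
            if depth > 1:
                # Extend the hypothetical history and plan the remaining rounds.
                nxt = history + [(mine, theirs)]
                future = max(action_values(model, nxt, depth - 1).values())
            total += p * (PAYOFF[(mine, theirs)] + future)
        values[mine] = total
    return values


def choose_action(model, history, depth):
    values = action_values(model, history, depth)
    return max(ACTIONS, key=values.get)


def tit_for_tat(history):
    """Assumed opponent with memory 1: cooperates first, then copies our last action."""
    return history[-1][0] if history else "C"


def observe(model, my_actions, opponent=tit_for_tat):
    """Play a scripted action sequence against the opponent and train the model."""
    history = []
    for mine in my_actions:
        theirs = opponent(history)
        model.update(history, theirs)
        history.append((mine, theirs))
    return history


model = NGramOpponentModel(k=1)
history = observe(model, ["C", "C", "D", "D", "C", "D", "C"])
greedy = choose_action(model, history, depth=1)   # one-step, reactive choice
planned = choose_action(model, history, depth=2)  # two-step lookahead
```

In this toy setting the depth-1 agent defects, because defection is the best reply to the predicted next move, while the depth-2 agent cooperates, because the model predicts that defecting now triggers retaliation in the following round. The final step of any finite lookahead still favours defection, which is one reason the depth is a design choice rather than a detail.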
