Conference: 2017 Intelligent Systems Conference

Exploiting action categories in learning complex games

Abstract

This paper presents a model for planning in a highly complex game in which certain action types are more common than others and cyclic behaviour arises easily. Both issues are addressed by exploiting the inherent structure among the available options to enhance the online learning algorithm: sampling during Monte Carlo Tree Search becomes a two-step process that first samples from a distribution over the types of legal actions and then samples an individual action of the chosen type. This policy drastically reduces both the breadth and the depth of the rollout by avoiding redundant sampling behaviour. The result is a large increase in both the performance and the efficiency of the model. A further contribution of this paper is an assessment of the benefits of a parallel implementation and of afterstates in complex games. Evaluation is done via agent simulations in the board game Settlers of Catan. The resulting agent is the first based purely on online learning strategies that can handle the full set of legal actions of the game. The evaluation shows that the model outperforms previous state-of-the-art agents while making decisions within a time threshold tolerated by human opponents.
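
The two-step sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `sample_action_two_step`, the `(action_type, action)` pair representation, and the `type_weights` dictionary are all assumptions made for illustration, and the abstract does not say how the distribution over action types is obtained. The sketch only shows why choosing a type first keeps a numerous action type (for example, many trade offers) from dominating the rollout.

```python
import random
from collections import defaultdict


def sample_action_two_step(legal_actions, type_weights, rng=random):
    """Sample one legal action in two steps: first a type, then an action.

    legal_actions: list of (action_type, action) pairs (hypothetical format).
    type_weights:  dict mapping action_type -> non-negative weight
                   (hypothetical; types missing from the dict get weight 1.0).
    rng:           the random module or a random.Random instance.
    """
    # Group the legal actions by their type.
    by_type = defaultdict(list)
    for action_type, action in legal_actions:
        by_type[action_type].append(action)

    # Step 1: sample an action type from the distribution over legal types.
    # random.choices raises ValueError if all weights are zero.
    types = list(by_type)
    weights = [type_weights.get(t, 1.0) for t in types]
    chosen_type = rng.choices(types, weights=weights, k=1)[0]

    # Step 2: sample uniformly among the actions of the chosen type.
    return chosen_type, rng.choice(by_type[chosen_type])


if __name__ == "__main__":
    # Illustrative state: 50 trade offers, one build action, one end-turn action.
    legal = [("trade", i) for i in range(50)]
    legal += [("build", "road"), ("end_turn", None)]
    weights = {"trade": 1.0, "build": 1.0, "end_turn": 1.0}
    print(sample_action_two_step(legal, weights, random.Random(0)))
```

With equal type weights, a trade action is chosen about one time in three in this example, whereas uniform sampling over all 52 individual actions would choose a trade about 96% of the time.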