Conference: European Conference on Applications of Evolutionary Computation

Fast Evolutionary Adaptation for Monte Carlo Tree Search

Abstract

This paper describes a new adaptive Monte Carlo Tree Search (MCTS) algorithm that uses evolution to rapidly optimise its performance. An evolutionary algorithm is used as a source of control parameters to modify the behaviour of each iteration (i.e. each simulation or roll-out) of the MCTS algorithm; in this paper we largely restrict this to modifying the behaviour of the random default policy, though it can also be applied to modify the tree policy. This method of tightly integrating evolution into the MCTS algorithm means that evolutionary adaptation occurs on a much faster time-scale than has previously been achieved, and addresses a particular problem with MCTS which frequently occurs in real-time video and control problems: that uniform random roll-outs may be uninformative. Results are presented on the classic Mountain Car reinforcement learning benchmark and also on a simplified version of Space Invaders. The results clearly demonstrate the value of the approach, significantly outperforming "standard" MCTS in each case. Furthermore, the adaptation is almost immediate, with no perceptual delay as the system learns: the agent frequently performs well from its very first game.
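The core idea, an evolutionary algorithm supplying the control parameters for each roll-out, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy 1-D task, the softmax parameterisation of the default policy, the (1+1) evolution strategy, and the names `action_probs`, `rollout`, and `adapt`. The search tree itself is omitted, so this shows only the roll-out adaptation loop, not a full MCTS.

```python
import math
import random

ACTIONS = (-1, +1)   # hypothetical toy action set: step left or right
HORIZON = 20         # assumed roll-out length


def action_probs(weights):
    """Softmax over per-action weights (assumed policy parameterisation)."""
    m = max(weights)
    exps = [math.exp(w - m) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def rollout(weights, rng):
    """One biased roll-out on a 1-D line from position 0.

    The return is simply the final position, so moving right is rewarded.
    """
    probs = action_probs(weights)
    pos = 0
    for _ in range(HORIZON):
        pos += rng.choices(ACTIONS, weights=probs, k=1)[0]
    return pos


def adapt(n_rollouts, rng, sigma=0.5):
    """(1+1) evolution strategy: one mutated candidate per roll-out."""
    parent = [0.0, 0.0]                # equal weights = uniform random policy
    parent_fit = rollout(parent, rng)
    for _ in range(n_rollouts):
        child = [w + rng.gauss(0.0, sigma) for w in parent]
        child_fit = rollout(child, rng)    # fitness is a single noisy return
        if child_fit >= parent_fit:        # keep the child if it did no worse
            parent, parent_fit = child, child_fit
    return parent
```

Calling `adapt(200, random.Random(0))` returns the final weight pair. In a full MCTS agent, each roll-out's return would also be backed up through the tree as usual, so the same simulation would serve both as a tree update and as a fitness evaluation.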
