Computational Intelligence Magazine, IEEE

Intelligent Agents for the Game of Go



Abstract

Monte-Carlo Tree Search (MCTS) was recently proposed [1, 2, 3] for decision making in discrete-time control problems. It has been applied very efficiently to games [4, 5, 6, 7, 8], but also to planning problems and fundamental artificial intelligence tasks [9, 10]. It clearly outperforms alpha-beta techniques when there is no human expertise that can easily be encoded in a value function. In this section, we describe MCTS and how it enabled great improvements in computer Go. Section II shows the strengths and limitations of MCTS, and in particular its lack of learning. There are, however, a few known techniques for introducing learning: Rapid-Action Value Estimate (RAVE) and learnt patterns (both well known now, and discussed below); our focus is on more recent and less widely known learning techniques introduced in MCTS. The next two sections show these less standard applications of supervised learning within MCTS: Section III shows how to use past games to improve future games, and Section IV shows the inclusion of learning inside a given MCTS run. Section V concludes.
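The MCTS decision loop the abstract refers to (selection, expansion, simulation, backpropagation) can be illustrated on a toy game. The following is a minimal sketch assuming the standard UCT formulation, applied to one-pile Nim rather than Go; all names here are ours, not from the article:

```python
import math
import random

# Toy game: one-pile Nim. Each player removes 1-3 stones;
# whoever takes the last stone wins.

class Node:
    def __init__(self, stones, to_move, parent=None, move=None):
        self.stones = stones          # stones left in the pile
        self.to_move = to_move        # player (0 or 1) about to move
        self.parent = parent
        self.move = move              # move that led to this node
        self.children = []
        self.wins = 0.0               # wins for the player who made self.move
        self.visits = 0
        self.untried = [m for m in (1, 2, 3) if m <= stones]

def uct_child(node, c=1.4):
    # UCT selection rule: exploit high win rates, explore rarely-tried moves.
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(stones, to_move):
    # Play random moves to the end; return the winner (took the last stone).
    while stones > 0:
        stones -= random.choice([m for m in (1, 2, 3) if m <= stones])
        to_move = 1 - to_move
    return 1 - to_move

def mcts(stones, to_move=0, iters=2000):
    root = Node(stones, to_move)
    for _ in range(iters):
        node = root
        # 1. Selection: descend with UCT while fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one untried child.
        if node.untried:
            m = node.untried.pop()
            child = Node(node.stones - m, 1 - node.to_move, node, m)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        winner = rollout(node.stones, node.to_move)
        # 4. Backpropagation: credit the player who moved into each node.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.to_move:
                node.wins += 1
            node = node.parent
    # Final move: most-visited child, a common robust choice.
    return max(root.children, key=lambda ch: ch.visits).move
```

From a pile of 5 the only winning move is to take 1 stone (leaving a multiple of 4), and with a few thousand iterations the search settles on it. Note that nothing game-specific is encoded beyond the move generator and the terminal test, which is exactly why MCTS works without a hand-crafted value function.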

