
Bootstrapping from Game Tree Search



Abstract

In this paper we introduce a new algorithm for updating the parameters of a heuristic evaluation function, by updating the heuristic towards the values computed by an alpha-beta search. Our algorithm differs from previous approaches to learning from search, such as Samuel's checkers player and the TD-Leaf algorithm, in two key ways. First, we update all nodes in the search tree, rather than a single node. Second, we use the outcome of a deep search, instead of the outcome of a subsequent search, as the training signal for the evaluation function. We implemented our algorithm in the chess program Meep, using a linear heuristic function. After initialising its weight vector to small random values, Meep was able to learn high-quality weights from self-play alone. When tested online against human opponents, Meep played at a master level, the best performance of any chess program with a heuristic learned entirely from self-play.
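As a concrete sketch of the core idea — updating the evaluation function at every node of the search tree toward that node's own search value — the following minimal Python example may help. Everything here is a hypothetical stand-in for the paper's setting: a toy subtraction game replaces chess, a made-up three-feature linear evaluation replaces Meep's heuristic, and plain depth-limited negamax replaces alpha-beta. It illustrates the training signal, not the paper's implementation.

```python
import random

random.seed(0)

# Toy game: two players alternately remove 1 or 2 stones from a pile;
# whoever takes the last stone wins. Positions with n divisible by 3
# are losses for the player to move.

def features(n):
    # hypothetical feature vector for a pile of n stones
    return [1.0, float(n % 2), float(n % 3 == 0)]

def evaluate(w, n):
    # linear heuristic, from the perspective of the player to move
    return sum(wi * fi for wi, fi in zip(w, features(n)))

def negamax(n, depth, w, visited):
    """Depth-limited negamax that records (state, search value) for
    every interior node, so the heuristic can be updated at all of
    them rather than at the root alone."""
    if n == 0:
        return -1.0  # opponent took the last stone: mover has lost
    if depth == 0:
        return evaluate(w, n)  # bootstrap from the current heuristic
    v = max(-negamax(n - k, depth - 1, w, visited)
            for k in (1, 2) if n - k >= 0)
    visited.append((n, v))
    return v

def treestrap_step(w, root, depth=6, eta=0.01):
    # search, then nudge every visited node's evaluation toward the
    # value the deep search computed for it
    visited = []
    negamax(root, depth, w, visited)
    for state, target in visited:
        phi = features(state)
        err = target - evaluate(w, state)
        for i in range(len(w)):
            w[i] += eta * err * phi[i]

# train from small random weights by repeated self-play searches
w = [random.uniform(-0.1, 0.1) for _ in range(3)]
for _ in range(300):
    treestrap_step(w, random.randint(3, 12))
```

After training, the learned weights should assign negative values to the losing positions (n divisible by 3) and positive values elsewhere, which the linear feature set above can represent exactly.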
