
Self-teaching adaptive dynamic programming for Gomoku


Abstract

In this paper, adaptive dynamic programming (ADP) is applied to learning to play Gomoku. The critic network is used to evaluate board situations. The basic idea is to penalize the last move taken by the loser and reward the last move selected by the winner at the end of a game. The results show that the presented program is able to improve its performance by playing against itself and has approached the candidate level of a commercial Gomoku program called 5-star Gomoku. We also examine the influence of two methods for generating games, self-teaching and learning by watching two experts play against each other, and present the comparison results together with the reasons for them.
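
The abstract does not specify the network architecture, the board encoding, or the exact update rule, so the following is only a minimal sketch of the stated idea under assumptions: a small one-hidden-layer critic, a target of 1 for the winner's final position and 0 for the loser's, and a bootstrapped TD(0) target for earlier positions. The names `Critic` and `learn_from_game`, the 4-feature toy positions, and all hyperparameters are hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class Critic:
    """One-hidden-layer network: board feature vector -> value in (0, 1)."""

    def __init__(self, n_inputs, n_hidden=8, lr=0.5, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        # One extra output weight acts as a bias on the output unit.
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        self.lr = lr

    def _forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w1]
        hidden.append(1.0)  # constant input for the output bias
        value = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)))
        return hidden, value

    def value(self, x):
        return self._forward(x)[1]

    def update(self, x, target):
        """One gradient step on 0.5 * (target - value)^2."""
        hidden, v = self._forward(x)
        delta_out = (target - v) * v * (1.0 - v)
        # Hidden-layer step uses the output weights before they are changed.
        for j, row in enumerate(self.w1):
            h = hidden[j]
            delta_hidden = delta_out * self.w2[j] * h * (1.0 - h)
            for i, xi in enumerate(x):
                row[i] += self.lr * delta_hidden * xi
        for j, h in enumerate(hidden):
            self.w2[j] += self.lr * delta_out * h


def learn_from_game(critic, positions, won, gamma=1.0):
    """Update the critic from one player's positions in a finished game.

    positions: feature vectors of the board after each of this player's moves.
    won: True if this player won. The final position gets target 1 (reward)
    or 0 (penalty); earlier positions get a bootstrapped TD(0) target
    (an assumption of this sketch).
    """
    target = 1.0 if won else 0.0
    for x in reversed(positions):
        critic.update(x, target)
        target = gamma * critic.value(x)


# Toy 4-feature "positions" standing in for encoded boards (hypothetical data).
winner_positions = [[1, 0, 0, 0], [1, 0, 1, 0]]
loser_positions = [[0, 1, 0, 0], [0, 1, 0, 1]]

critic = Critic(n_inputs=4)
for _ in range(500):
    learn_from_game(critic, winner_positions, won=True)
    learn_from_game(critic, loser_positions, won=False)

print(critic.value(winner_positions[-1]), critic.value(loser_positions[-1]))
```

In a self-teaching setup, both position lists would come from the same program playing both sides; when learning from expert games, they would come from recorded play instead. The update itself is the same in either case, which is why the sketch takes the positions as plain input.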

Bibliographic information

  • Source
    Neurocomputing | 2012, Issue 1 | pp. 23-29 | 7 pages
  • Author affiliations

    State Key Laboratory of Intelligent Control and Management of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

    State Key Laboratory of Intelligent Control and Management of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

    State Key Laboratory of Intelligent Control and Management of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Text language: English
  • Keywords

    gomoku; reinforcement learning; adaptive dynamic programming; temporal difference learning; neural network;

