TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play

Abstract

TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results, based on the TD(λ) reinforcement learning algorithm (Sutton 1988). Despite starting from random initial weights (and hence random initial strategy), TD-Gammon achieves a surprisingly strong level of play. With zero knowledge built in at the start of learning (i.e., given only a “raw” description of the board state), the network learns to play at a strong intermediate level. Furthermore, when a set of hand-crafted features is added to the network's input representation, the result is a truly staggering level of performance: the latest version of TD-Gammon is now estimated to play at a strong master level that is extremely close to the world's best human players.
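The TD(λ) update named in the abstract can be illustrated with a minimal sketch. This is not Tesauro's implementation: TD-Gammon trains a neural network through backgammon self-play, whereas the example below substitutes a 5-state random walk and a table of value estimates so that it stays short and runnable. The function name `td_lambda_random_walk`, the step size, and the trace-decay value are all assumptions chosen for illustration.

```python
import random


def td_lambda_random_walk(episodes=5000, alpha=0.02, lam=0.7, seed=0):
    """TD(lambda) with accumulating eligibility traces on a toy random walk.

    Hypothetical stand-in for backgammon: states 0..4 are non-terminal,
    stepping left of state 0 ends the episode with outcome 0 (a loss), and
    stepping right of state 4 ends it with outcome 1 (a win). There is no
    discounting, and the only reward is the final outcome.
    """
    rng = random.Random(seed)
    n_states = 5
    values = [0.5] * n_states  # one estimate per state (tabular, not a network)

    for _ in range(episodes):
        traces = [0.0] * n_states  # eligibility traces, reset every episode
        state = n_states // 2      # each episode starts in the middle state

        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                target, done = 0.0, True   # terminal outcome: loss
            elif next_state >= n_states:
                target, done = 1.0, True   # terminal outcome: win
            else:
                # Bootstrap from the current estimate of the successor state.
                target, done = values[next_state], False

            td_error = target - values[state]

            # Decay all traces by lambda, then mark the state just visited.
            traces = [lam * e for e in traces]
            traces[state] += 1.0

            # Every recently visited state shares in the correction.
            for i in range(n_states):
                values[i] += alpha * td_error * traces[i]

            if done:
                break
            state = next_state

    return values


if __name__ == "__main__":
    for s, v in enumerate(td_lambda_random_walk()):
        print(f"state {s}: estimated win probability {v:.3f}")
```

For this toy problem the true win probabilities are 1/6, 2/6, ..., 5/6, so the learned estimates should land near those values. The estimates start from an uninformative 0.5 and are corrected only by differences between successive predictions and the final outcome, which mirrors, in a very reduced form, the learning-from-results idea the abstract describes.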

Bibliographic record

  • Source: Neural Computation, 1994, Issue 2, pp. 215-219 (5 pages)
  • Author: Tesauro G
  • Affiliation: IBM Thomas J. Watson Research Center, P. O. Box 704, Yorktown Heights, NY 10598 USA
  • Indexed in: Science Citation Index (SCI, USA); Chemical Abstracts (CA, USA)
  • Original format: PDF
  • Language: English
