Journal: Nature

Mastering Atari, Go, chess and shogi by planning with a learned model



Abstract

Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess(1) and Go(2), where a perfect simulator is available. However, in real-world problems, the dynamics governing the environment are often complex and unknown. Here we present the MuZero algorithm, which, by combining a tree-based search with a learned model, achieves superhuman performance in a range of challenging and visually complex domains, without any knowledge of their underlying dynamics. The MuZero algorithm learns an iterable model that produces predictions relevant to planning: the action-selection policy, the value function and the reward. When evaluated on 57 different Atari games(3), the canonical video game environment for testing artificial intelligence techniques, in which model-based planning approaches have historically struggled(4), the MuZero algorithm achieved state-of-the-art performance. When evaluated on Go, chess and shogi (canonical environments for high-performance planning), the MuZero algorithm matched, without any knowledge of the game dynamics, the superhuman performance of the AlphaZero algorithm(5) that was supplied with the rules of the game.
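To make the phrase "iterable model" concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the model is split into three parts (an encoder, a transition function and prediction heads); the abstract itself does not state this split. The function names `represent`, `dynamics`, `predict` and `unroll`, the constant `NUM_ACTIONS`, and the fixed toy arithmetic standing in for trained networks are all hypothetical. The point is only that the model can be applied repeatedly to its own output, yielding a policy, a value and a reward at each imagined step.

```python
import math
from typing import List, Sequence, Tuple

NUM_ACTIONS = 3  # hypothetical action count for this toy example


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def represent(observation: Sequence[float]) -> List[float]:
    """Toy stand-in for a learned encoder: observation -> hidden state."""
    return [math.tanh(x) for x in observation]


def dynamics(state: Sequence[float], action: int) -> Tuple[List[float], float]:
    """Toy stand-in for a learned transition: (state, action) -> (next state, reward)."""
    shift = (action + 1) / NUM_ACTIONS
    next_state = [math.tanh(s + shift) for s in state]
    reward = sum(next_state) / len(next_state)
    return next_state, reward


def predict(state: Sequence[float]) -> Tuple[List[float], float]:
    """Toy stand-in for the learned heads: hidden state -> (policy, value)."""
    logits = [sum(state) * (a + 1) / NUM_ACTIONS for a in range(NUM_ACTIONS)]
    value = math.tanh(sum(state))
    return softmax(logits), value


def unroll(
    observation: Sequence[float], actions: Sequence[int]
) -> List[Tuple[List[float], float, float]]:
    """Iterate the model along an action sequence without querying a real environment."""
    state = represent(observation)
    policy, value = predict(state)
    steps = [(policy, value, 0.0)]  # no reward is predicted at the root
    for action in actions:
        state, reward = dynamics(state, action)  # the model feeds on its own output
        policy, value = predict(state)
        steps.append((policy, value, reward))
    return steps
```

Calling `unroll([0.1, -0.2, 0.3], [0, 2, 1])` returns four `(policy, value, reward)` tuples: one for the root and one per imagined action. A tree search would expand many such action sequences and compare their predicted rewards and values rather than following a single one.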
