
Self-play reinforcement learning with comprehensive critic in computer games



Abstract

Self-play reinforcement learning, where agents learn by playing against themselves, has been successfully applied in many game scenarios. However, the training procedure for self-play reinforcement learning is unstable and less sample-efficient than general reinforcement learning, especially in imperfect-information games. To improve the self-play training process, we incorporate a comprehensive critic into the policy gradient method to form a self-play actor-critic (SPAC) method for training agents to play computer games. We evaluate our method in four different environments covering both competitive and cooperative tasks. The results show that agents trained with our SPAC method outperform those trained with the deep deterministic policy gradient (DDPG) and proximal policy optimization (PPO) algorithms under many different evaluation approaches, which confirms the benefit of our comprehensive critic in the self-play training procedure. © 2021 Elsevier B.V. All rights reserved.
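
To make the general idea concrete, the following is a minimal sketch of a generic self-play policy-gradient loop with a learned critic used as a baseline. It is not the paper's SPAC algorithm: the abstract does not say what makes its critic "comprehensive", so the critic here is assumed to be a single scalar value estimate. The toy game (rock-paper-scissors), the function name `train_self_play`, and all hyperparameters are hypothetical choices for illustration only.

```python
import math
import random

# Rock-paper-scissors payoff for seat 1: PAYOFF[a][b] is in {-1, 0, 1}.
# Actions: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an index according to the given probabilities."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


def train_self_play(steps=2000, actor_lr=0.05, critic_lr=0.1, seed=0):
    """Hypothetical toy loop: one policy plays both seats of the game."""
    rng = random.Random(seed)
    logits = [0.0, 0.0, 0.0]  # actor parameters, shared by both seats
    value = 0.0               # critic: scalar estimate of expected reward
    for _ in range(steps):
        probs = softmax(logits)
        a = sample(probs, rng)  # current policy in seat 1
        b = sample(probs, rng)  # a copy of the same policy in seat 2
        reward = PAYOFF[a][b]
        advantage = reward - value  # critic acts as a baseline
        # Policy gradient: d log pi(a) / d logit_i = 1[i == a] - probs[i]
        for i in range(3):
            grad = (1.0 if i == a else 0.0) - probs[i]
            logits[i] += actor_lr * advantage * grad
        # Move the critic toward the observed reward.
        value += critic_lr * advantage
    return softmax(logits), value


if __name__ == "__main__":
    policy, baseline = train_self_play()
    print("policy:", policy)
    print("critic value:", baseline)
```

Subtracting the critic's estimate from the reward is a standard way to reduce the variance of the policy-gradient update. Only seat 1's experience is used for the update here, purely to keep the sketch short.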

Bibliographic record

  • Source
    Neurocomputing | 2021, Issue 18 | pp. 207-213 | 7 pages
  • Author affiliations

    Zhejiang Univ Inst Cyber Syst & Control Hangzhou Peoples R China;

    Zhejiang Univ Inst Cyber Syst & Control Hangzhou Peoples R China;

    Zhejiang Univ Inst Cyber Syst & Control Hangzhou Peoples R China;

    Zhejiang Univ Inst Cyber Syst & Control Hangzhou Peoples R China;

    Zhejiang Univ Inst Cyber Syst & Control Hangzhou Peoples R China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA;
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Reinforcement learning; Self-play; Computer game;

