IEEJ Transactions on Electrical and Electronic Engineering

Distributed deep reinforcement learning method using profit sharing for learning acceleration


Abstract

Profit Sharing (PS), a reinforcement learning method that strongly reinforces successful experiences, has been shown to improve learning speed when combined with a deep Q-network (DQN). We expect a further improvement in learning speed by integrating PS-based learning with the Ape-X DQN, which offers state-of-the-art learning speed, in place of the DQN. However, PS-based learning does not use replay memory, whereas the Ape-X DQN requires it because exploration of the environment to collect experiences and network training are performed asynchronously. In this study, we propose Learning-accelerated Ape-X, which integrates the Ape-X DQN and PS-based learning with several improvements, including the use of replay memory. We show through numerical experiments that the proposed method improves scores in Atari 2600 video games in a shorter time than the Ape-X DQN. (c) 2020 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
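The core idea of Profit Sharing described above is credit assignment along a successful episode: when an episode ends with a reward, every state-action pair visited on the way is reinforced, with credit decaying geometrically for steps farther from the success. A minimal tabular sketch of this idea (the function name, decay schedule, and learning rate are illustrative assumptions, not the paper's exact formulation) might look like:

```python
from collections import defaultdict

def profit_sharing_update(q, episode, reward, decay=0.5, lr=0.1):
    """Reinforce every (state, action) pair along a successful episode.

    Credit propagates backward from the terminal reward with a
    geometrically decaying credit-assignment function, so steps
    closer to the success are reinforced more strongly.
    q       -- table mapping (state, action) -> value
    episode -- list of (state, action) pairs in visit order
    reward  -- terminal reward obtained at the end of the episode
    """
    credit = reward
    for state, action in reversed(episode):
        q[(state, action)] += lr * credit  # reinforce this step
        credit *= decay                    # weaker credit further back
    return q

# Example: a two-step episode ending in a reward of 1.0.
q = defaultdict(float)
profit_sharing_update(q, [("s0", "a"), ("s1", "b")], reward=1.0)
```

In this sketch the last step (`("s1", "b")`) receives the full decayed credit `lr * 1.0 = 0.1`, while the earlier step receives `lr * 0.5 = 0.05`; contrast this with standard one-step Q-learning, which would update only the most recent transition.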
