Journal: Neurocomputing

Pseudo-rehearsal: Achieving deep reinforcement learning without catastrophic forgetting



Abstract

Neural networks can achieve excellent results in a wide variety of applications. However, when they attempt to learn tasks sequentially, they tend to learn the new task while catastrophically forgetting previous ones. We propose a model that overcomes catastrophic forgetting in sequential reinforcement learning by combining ideas from continual learning in both the image classification domain and the reinforcement learning domain. This model features a dual memory system, which separates continual learning from reinforcement learning, and a pseudo-rehearsal system that "recalls" items representative of previous tasks via a deep generative network. Our model sequentially learns three Atari 2600 games without demonstrating catastrophic forgetting and continues to perform above human level on all of them. This result is achieved without demanding additional storage as the number of tasks increases, without storing raw data, and without revisiting past tasks. In comparison, previous state-of-the-art solutions are substantially more vulnerable to forgetting on these complex deep reinforcement learning tasks. (C) 2020 Elsevier B.V. All rights reserved.
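The pseudo-rehearsal idea described in the abstract can be illustrated with a toy sketch (not the authors' code): a generative model of old inputs stands in for stored data, and the old network labels the generated "recalled" items, which are then mixed into training on the new task. In this hypothetical, minimal version, a Gaussian fitted to old inputs replaces the deep generative network and a linear least-squares model replaces the deep reinforcement learner; all names and settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def train(W, X, Y, lr=0.05, steps=500):
    """Gradient descent on mean-squared error for a linear 'network'."""
    for _ in range(steps):
        W = W - lr * X.T @ (X @ W - Y) / len(X)
    return W

d, k = 5, 2
W_A_true = rng.normal(size=(d, k))  # ground-truth mapping for task A
W_B_true = rng.normal(size=(d, k))  # ground-truth mapping for task B

# Task A: inputs drawn from a shifted Gaussian.
X_A = rng.normal(loc=1.0, size=(200, d))
Y_A = X_A @ W_A_true

W = train(np.zeros((d, k)), X_A, Y_A)  # learn task A first

# Stand-in for the deep generative network: a Gaussian fitted to
# task-A inputs, so no raw task-A data needs to be stored.
mu, sigma = X_A.mean(axis=0), X_A.std(axis=0)
X_pseudo = rng.normal(loc=mu, scale=sigma, size=(200, d))
Y_pseudo = X_pseudo @ W  # the old network labels its own "recalled" items

# Task B: different input distribution, different target mapping.
X_B = rng.normal(loc=-1.0, size=(200, d))
Y_B = X_B @ W_B_true

# Sequential learning WITHOUT rehearsal: task A is forgotten.
W_forget = train(W.copy(), X_B, Y_B)

# WITH pseudo-rehearsal: pseudo-items are mixed into task-B training.
W_rehearse = train(W.copy(),
                   np.vstack([X_B, X_pseudo]),
                   np.vstack([Y_B, Y_pseudo]))

def task_a_error(W):
    return float(np.mean((X_A @ W - Y_A) ** 2))

print("task-A error, no rehearsal:    ", round(task_a_error(W_forget), 3))
print("task-A error, pseudo-rehearsal:", round(task_a_error(W_rehearse), 3))
```

The rehearsed model retains far more of task A than the naively fine-tuned one, while storing only generator parameters rather than raw data, which mirrors the storage claim in the abstract.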

