
Recency-Weighted Acceleration for Continuous Control Through Deep Reinforcement Learning

International Conference on Neural Information Processing

Abstract

Model-free reinforcement learning algorithms have been successfully applied to continuous control tasks. However, these algorithms suffer from severe instability and high sample complexity. Inspired by Averaged-DQN, this paper proposes a recency-weighted target estimator for actor-critic settings, which constructs the target by placing greater weight on recently learned value functions, yielding a more stable and accurate value estimate. In addition, policy updates are delayed under a more flexible control scheme to reduce the per-update error caused by value-function estimation errors. Furthermore, to improve the performance of prioritized experience replay (PER) on continuous control tasks, Phased-PER is proposed to accelerate training during different periods. Experimental results demonstrate that, with the same hyper-parameters and architecture, the proposed algorithm is more robust and achieves better performance, surpassing existing methods on a range of continuous control benchmark tasks.
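To make the recency-weighting concrete, below is a minimal Python sketch assuming an exponential weighting over the K most recently saved critic snapshots. The abstract does not specify the paper's exact weighting scheme, so the class name RecencyWeightedTarget, the decay parameter beta, and the snapshot mechanism are illustrative assumptions, not the authors' implementation.

from collections import deque

import numpy as np


class RecencyWeightedTarget:
    """Keeps the K most recent critic snapshots and averages their
    predictions, with geometrically larger weight on newer snapshots."""

    def __init__(self, k: int = 5, beta: float = 0.7):
        self.snapshots = deque(maxlen=k)  # newest snapshot appended last
        self.beta = beta                  # decay applied to older snapshots

    def push(self, critic_fn):
        """Store a frozen copy of the critic after a learning step."""
        self.snapshots.append(critic_fn)

    def q_target(self, state, action):
        """Recency-weighted average of the stored critics' Q estimates."""
        n = len(self.snapshots)
        # Snapshot of age a gets weight beta**a, so the newest (age 0)
        # contributes most; weights are normalized to sum to one.
        weights = np.array([self.beta ** (n - 1 - i) for i in range(n)])
        weights /= weights.sum()
        qs = np.array([q(state, action) for q in self.snapshots])
        return float(np.dot(weights, qs))

In a TD3-style actor-critic loop, the Bellman target r + gamma * q_target(s', pi(s')) would then use this weighted estimate in place of a single target network's prediction, in the same spirit as Averaged-DQN's averaging of previously learned Q-networks.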
