IEICE Transactions on Information and Systems

Deep Reinforcement Learning with Sarsa and Q-Learning: A Hybrid Approach



Abstract

The commonly used Deep Q Networks algorithm is known to overestimate action values under certain conditions, and it has also been proved that these overestimations harm performance and may cause instability and divergence of learning. In this paper, we present the Deep Sarsa and Q Networks (DSQN) algorithm, which can be considered an enhancement of the Deep Q Networks algorithm. First, the DSQN algorithm adopts the experience replay and target network techniques of Deep Q Networks to improve the stability of the neural networks. Second, a double estimator is used in Q-learning to reduce overestimations. In particular, we introduce Sarsa learning into Deep Q Networks to remove overestimations further. Finally, the DSQN algorithm is evaluated on the cart-pole balancing, mountain car, and lunar lander control tasks from the OpenAI Gym. The empirical evaluation results show that the proposed method leads to reduced overestimations, a more stable learning process, and improved performance.
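The experience replay mechanism mentioned in the abstract can be illustrated with a minimal sketch: transitions are stored in a fixed-capacity buffer and sampled uniformly at random, which breaks the temporal correlation between consecutive updates. This is a generic illustration of the Deep Q Networks technique, not the authors' implementation; for the Sarsa component of DSQN, the stored transition would presumably also need to include the next action, which is an assumption on our part.

```python
import random
from collections import deque


class ReplayBuffer:
    """Minimal experience-replay buffer in the style of Deep Q Networks.

    Transitions are appended to a bounded deque (oldest entries are
    evicted once capacity is reached) and mini-batches are drawn
    uniformly at random for training.
    """

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one transition; old transitions fall off automatically.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random mini-batch, decorrelating consecutive updates.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

A target network would complement this buffer by holding a periodically synchronized copy of the online network's weights, so that the regression targets move slowly during training.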
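To make the overestimation-reduction idea concrete, the following sketch combines a double-Q-learning target (the online network selects the greedy next action, the target network evaluates it) with an on-policy Sarsa target (the target network evaluates the action actually taken next). The mixing weight `beta`, the function name, and the specific combination rule are all illustrative assumptions; the paper's exact DSQN update may differ.

```python
import numpy as np


def dsqn_target(reward, next_q_online, next_q_target, next_action,
                gamma=0.99, beta=0.5):
    """Hypothetical hybrid TD target (illustrative sketch only).

    next_q_online / next_q_target: per-action value estimates for the
    next state from the online and target networks, respectively.
    next_action: the action actually taken in the next state (Sarsa).
    beta: assumed mixing weight between the Sarsa and double-Q terms.
    """
    # Double Q-learning: online net selects, target net evaluates,
    # which counters the overestimation of the single-max operator.
    greedy = int(np.argmax(next_q_online))
    double_q = next_q_target[greedy]
    # Sarsa: evaluate the on-policy action actually taken next.
    sarsa_q = next_q_target[next_action]
    return reward + gamma * ((1.0 - beta) * double_q + beta * sarsa_q)
```

With `beta = 0` this reduces to a double-Q-learning target, and with `beta = 1` to a pure Sarsa target, so the weight interpolates between the off-policy and on-policy estimates.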
