
Distributional Reward Decomposition for Reinforcement Learning



Abstract

Many reinforcement learning (RL) tasks have specific properties that can be leveraged to modify existing RL algorithms to adapt to those tasks and further improve performance, and one general class of such properties is multiple reward channels. In those environments the full reward can be decomposed into sub-rewards obtained from different channels. Existing work on reward decomposition either requires prior knowledge of the environment to decompose the full reward, or decomposes the reward without prior knowledge but with degraded performance. In this paper, we propose Distributional Reward Decomposition for Reinforcement Learning (DRDRL), a novel reward decomposition algorithm which captures the multiple-reward-channel structure under a distributional setting. Empirically, our method captures the multi-channel structure and discovers meaningful reward decompositions without any requirement of prior knowledge. Consequently, our agent achieves better performance than existing methods on environments with multiple reward channels.
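The abstract gives no implementation details, so the sketch below is only a rough illustration of the general idea: a categorical (C51-style) distributional agent whose value head is split into one sub-return head per assumed reward channel, with the full-return distribution formed by convolving the per-channel distributions (which treats the sub-returns as independent). The class and function names, the number of channels, and the truncation of the convolved support are all illustrative assumptions, not the authors' DRDRL implementation.

```python
# Illustrative sketch only (assumed PyTorch code, not the authors' implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiChannelDistributionalHead(nn.Module):
    def __init__(self, feature_dim: int, num_actions: int,
                 num_channels: int = 2, num_atoms: int = 51):
        super().__init__()
        self.num_actions = num_actions
        self.num_atoms = num_atoms
        # One categorical head per hypothetical reward channel.
        self.heads = nn.ModuleList(
            nn.Linear(feature_dim, num_actions * num_atoms)
            for _ in range(num_channels)
        )

    def forward(self, features: torch.Tensor):
        # Per-channel distributions over the return support: (batch, actions, atoms).
        channel_probs = [
            F.softmax(h(features).view(-1, self.num_actions, self.num_atoms), dim=-1)
            for h in self.heads
        ]
        # Combine channels into a full-return distribution by convolution.
        full = channel_probs[0]
        for p in channel_probs[1:]:
            full = batched_convolve(full, p)
        return channel_probs, full


def batched_convolve(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """Convolve two categorical pmfs along the atom axis, per (batch, action) pair.

    The exact sum distribution lives on 2*atoms - 1 atoms; here it is truncated
    back to `atoms` and renormalised purely to keep the sketch short.
    """
    batch, actions, atoms = p.shape
    n = batch * actions
    x = p.reshape(1, n, atoms)              # one "sample", n grouped channels
    w = q.reshape(n, 1, atoms).flip(-1)     # flip kernel: conv1d is cross-correlation
    out = F.conv1d(x, w, padding=atoms - 1, groups=n)
    out = out[..., :atoms].reshape(batch, actions, atoms)
    return out / out.sum(dim=-1, keepdim=True).clamp_min(1e-8)
```

Under these assumptions, the per-channel heads expose the decomposition (each head's expected value acts as a sub-return estimate), while the convolved distribution can be trained with a standard distributional RL loss against the observed full reward.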

