Conference on Neural Information Processing Systems

Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards

Abstract

While using shaped rewards can be beneficial when solving sparse reward tasks, their successful application often requires careful engineering and is problem specific. For instance, in tasks where the agent must achieve some goal state, simple distance-to-goal reward shaping often fails, as it renders learning vulnerable to local optima. We introduce a simple and effective model-free method to learn from shaped distance-to-goal rewards on tasks where success depends on reaching a goal state. Our method introduces an auxiliary distance-based reward based on pairs of rollouts to encourage diverse exploration. This approach effectively prevents learning dynamics from stabilizing around local optima induced by the naive distance-to-goal reward shaping and enables policies to efficiently solve sparse reward tasks. Our augmented objective does not require any additional reward engineering or domain expertise to implement and converges to the original sparse objective as the agent learns to solve the task. We demonstrate that our method successfully solves a variety of hard-exploration tasks (including maze navigation and 3D construction in a Minecraft environment), where naive distance-based reward shaping otherwise fails, and intrinsic curiosity and reward relabeling strategies exhibit poor performance.
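The abstract describes the auxiliary reward only in words, so the following is a minimal sketch of one plausible form, not the authors' exact formulation. It assumes Euclidean distance, a terminal-state reward, and a hypothetical `success_radius` threshold; the function names and the clipping at zero are illustrative choices. Each rollout is paired with a "sibling" rollout toward the same goal, and it is rewarded for ending near the goal while ending far from where its sibling ended, which discourages both rollouts from settling in the same local optimum.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length points (an assumed metric)."""
    return math.dist(a, b)


def naive_shaped_reward(final_state, goal):
    """Plain distance-to-goal shaping: closer to the goal is better."""
    return -distance(final_state, goal)


def sibling_shaped_reward(final_state, sibling_final_state, goal,
                          success_radius=0.1):
    """Hypothetical self-balancing reward for one rollout of a pair.

    If the rollout ends within success_radius of the goal, return the sparse
    success reward (1.0). Otherwise combine the distance-to-goal penalty with
    a bonus for ending far from the sibling's terminal state, clipped at 0 so
    a failed rollout never earns a positive reward.
    """
    if distance(final_state, goal) <= success_radius:
        return 1.0
    to_goal = distance(final_state, goal)
    from_sibling = distance(final_state, sibling_final_state)
    return min(0.0, -to_goal + from_sibling)


if __name__ == "__main__":
    goal = (0.0, 0.0)
    stuck = (3.0, 4.0)  # e.g. a dead end in a maze, 5 units from the goal
    print(naive_shaped_reward(stuck, goal))
    print(sibling_shaped_reward(stuck, stuck, goal))       # sibling ended in the same spot
    print(sibling_shaped_reward(stuck, (6.0, 8.0), goal))  # sibling ended elsewhere
```

Under these assumptions, once both siblings reach the goal every rollout receives the sparse success reward, which matches the abstract's claim that the augmented objective converges to the original sparse objective as the agent learns to solve the task.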
