
Comparing Reward Shaping, Visual Hints, and Curriculum Learning


Abstract

Designers who want to reduce the learning effort that Reinforcement Learning (RL) agents need on complex tasks can apply several common approaches. Reward shaping augments the immediate reward provided by the environment, effectively encouraging (or discouraging) specific actions. Curriculum learning (Bengio et al. 2009) aims to help an agent learn a complex task by first learning a sequence of simpler tasks. Hints may also be provided (e.g., a yellow brick road); these fall outside the notion of shaping or a curriculum. Despite the prevalence of these approaches, few studies examine how they compare to (or complement) each other, or when one approach is better than another.
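As a rough illustration of the first two ideas, the following minimal Python sketch shows reward shaping as a designer-chosen bonus added to the environment's reward, and a curriculum as an ordered sequence of tasks trained from simplest to hardest. The names `shaped_reward`, `run_curriculum`, and the `train` callback are hypothetical and do not come from the paper; this is a sketch under those assumptions, not the authors' implementation.

```python
from typing import Callable, List


def shaped_reward(env_reward: float, bonus: float) -> float:
    """Reward shaping (illustrative): add a designer-chosen bonus to the
    environment's immediate reward. A negative bonus discourages an action."""
    return env_reward + bonus


def run_curriculum(tasks: List[str], train: Callable[[str], None]) -> List[str]:
    """Curriculum learning (illustrative): train on each task in the given
    order, assumed to run from simplest to most complex.

    `train` is a hypothetical callback that trains the agent on one task.
    Returns the tasks in the order they were trained.
    """
    completed: List[str] = []
    for task in tasks:
        train(task)
        completed.append(task)
    return completed
```

For example, `shaped_reward(1.0, 0.5)` adds a bonus of 0.5 to an environment reward of 1.0, and `run_curriculum(["easy", "hard"], train)` calls `train` on `"easy"` before `"hard"`.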
