International Joint Conference on Neural Networks

Novelty-Guided Reinforcement Learning via Encoded Behaviors

Abstract

Despite the successful application of Deep Reinforcement Learning (DRL) to a wide range of complex tasks, agents often learn sub-optimal behavior due to the sparse or deceptive nature of rewards, or require a large number of interactions with the environment. Recent methods combine DRL with a class of algorithms known as Novelty Search (NS), which circumvents this problem by encouraging exploration towards novel behaviors. Even without exploiting any environment rewards, such methods can learn skills that yield competitive results on several tasks. However, to assign novelty scores to policies, these methods rely on neighborhood models that store behaviors in an archive set; hence they neither scale nor generalize to complex tasks that require many policy evaluations. To address these challenges, we propose a function-approximation paradigm that instead learns sparse representations of agent behaviors using auto-encoders, which are then used to assign novelty scores to policies. Experimental results on benchmark tasks suggest that this form of novelty-guided exploration is a viable alternative to classic novelty search methods.
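
The two scoring schemes contrasted in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes each policy is summarized by a fixed-length behavior characterization (BC) vector, scores novelty classically as the mean distance to the k nearest neighbors in an archive, and alternatively as the reconstruction error of a small auto-encoder with an L1 sparsity penalty on its latent code. The network sizes, beta weight, and function names are all illustrative assumptions.

    # Illustrative sketch only (not the paper's implementation). Assumes each
    # policy is summarized by a fixed-length behavior characterization (BC) vector.
    import numpy as np
    import torch
    import torch.nn as nn

    def knn_novelty(bc, archive, k=10):
        # Classic NS scoring: mean Euclidean distance from this behavior to its
        # k nearest neighbors in the archive of previously stored behaviors.
        dists = np.linalg.norm(archive - bc, axis=1)
        return float(np.sort(dists)[:k].mean())

    class BehaviorAutoEncoder(nn.Module):
        # Small auto-encoder over BC vectors. A behavior that the model
        # reconstructs poorly is unlike those it was trained on, so the
        # reconstruction error can stand in for a novelty score, with no
        # archive or nearest-neighbor queries needed at scoring time.
        def __init__(self, bc_dim, latent_dim=8):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(bc_dim, 32), nn.ReLU(), nn.Linear(32, latent_dim))
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, bc_dim))

        def forward(self, x):
            return self.decoder(self.encoder(x))

    def ae_novelty(model, bc):
        # Novelty score = reconstruction error of the behavior.
        with torch.no_grad():
            x = torch.as_tensor(bc, dtype=torch.float32)
            return float(((model(x) - x) ** 2).mean())

    def train_step(model, opt, batch, beta=1e-3):
        # One gradient step on recently observed behaviors; the L1 penalty on
        # the latent code encourages sparse representations (beta is illustrative).
        x = torch.as_tensor(batch, dtype=torch.float32)
        z = model.encoder(x)
        loss = ((model.decoder(z) - x) ** 2).mean() + beta * z.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        return float(loss)

In a novelty-guided loop, one would evaluate each candidate policy, compute its BC vector, score it with ae_novelty rather than knn_novelty over an ever-growing archive, and periodically call train_step on the behaviors gathered so far (e.g. with opt = torch.optim.Adam(model.parameters(), lr=1e-3)), so the model tracks the distribution of behaviors explored to date.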
