International Joint Conference on Neural Networks

Novelty-Guided Reinforcement Learning via Encoded Behaviors

Abstract

Despite the successful application of Deep Reinforcement Learning (DRL) to a wide range of complex tasks, agents often learn sub-optimal behavior due to the sparse or deceptive nature of rewards, or require a large number of interactions with the environment. Recent methods combine DRL with a class of algorithms known as Novelty Search (NS), which circumvents this problem by encouraging exploration towards novel behaviors. Even without exploiting any environment rewards, such methods can learn skills that yield competitive results on several tasks. However, to assign novelty scores to policies, these methods rely on neighborhood models that store behaviors in an archive set; hence they neither scale nor generalize to complex tasks that require many policy evaluations. To address these challenges, we propose a function-approximation paradigm that instead learns sparse representations of agent behaviors using auto-encoders, which are then used to assign novelty scores to policies. Experimental results on benchmark tasks suggest that this form of novelty-guided exploration is a viable alternative to classic novelty search methods.
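
The two scoring schemes contrasted in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes each policy is summarized by a fixed-length behavior characterization (BC) vector, scores novelty classically as the mean distance to the k nearest neighbors in an archive, and alternatively as the reconstruction error of a small auto-encoder with an L1 sparsity penalty on its latent code. The network sizes, beta weight, and function names are all illustrative assumptions.

    # Illustrative sketch only (not the paper's implementation). Assumes each
    # policy is summarized by a fixed-length behavior characterization (BC) vector.
    import numpy as np
    import torch
    import torch.nn as nn

    def knn_novelty(bc, archive, k=10):
        # Classic NS scoring: mean Euclidean distance from this behavior to its
        # k nearest neighbors in the archive of previously stored behaviors.
        dists = np.linalg.norm(archive - bc, axis=1)
        return float(np.sort(dists)[:k].mean())

    class BehaviorAutoEncoder(nn.Module):
        # Small auto-encoder over BC vectors. A behavior that the model
        # reconstructs poorly is unlike those it was trained on, so the
        # reconstruction error can stand in for a novelty score, with no
        # archive or nearest-neighbor queries needed at scoring time.
        def __init__(self, bc_dim, latent_dim=8):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(bc_dim, 32), nn.ReLU(), nn.Linear(32, latent_dim))
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, bc_dim))

        def forward(self, x):
            return self.decoder(self.encoder(x))

    def ae_novelty(model, bc):
        # Novelty score = reconstruction error of the behavior.
        with torch.no_grad():
            x = torch.as_tensor(bc, dtype=torch.float32)
            return float(((model(x) - x) ** 2).mean())

    def train_step(model, opt, batch, beta=1e-3):
        # One gradient step on recently observed behaviors; the L1 penalty on
        # the latent code encourages sparse representations (beta is illustrative).
        x = torch.as_tensor(batch, dtype=torch.float32)
        z = model.encoder(x)
        loss = ((model.decoder(z) - x) ** 2).mean() + beta * z.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        return float(loss)

In a novelty-guided loop, one would evaluate each candidate policy, compute its BC vector, score it with ae_novelty rather than knn_novelty over an ever-growing archive, and periodically call train_step on the behaviors gathered so far (e.g. with opt = torch.optim.Adam(model.parameters(), lr=1e-3)), so the model tracks the distribution of behaviors explored to date.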
