Learning by Playing Solving Sparse Reward Tasks from Scratch

Martin Riedmiller; Roland Hafner; Thomas Lampe; Michael Neunert; Jonas Degrave; Tom Wiele; Vlad Mnih; Nicolas Heess; Jost Tobias Springenberg

Abstract

We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm in the context of Reinforcement Learning (RL). SAC-X enables learning of complex behaviors - from scratch - in the presence of multiple sparse reward signals. To this end, the agent is equipped with a set of general auxiliary tasks that it attempts to learn simultaneously via off-policy RL. The key idea behind our method is that active (learned) scheduling and execution of auxiliary policies allows the agent to efficiently explore its environment - enabling it to excel at sparse reward RL. Our experiments in several challenging robotic manipulation settings demonstrate the power of our approach.
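
The abstract names two ingredients: a scheduler that decides which task's policy to execute, and off-policy learning of all tasks from shared experience. The following is a minimal Python sketch of that interplay, not the authors' implementation. Everything in it is an assumption made for illustration: the softmax scheduler over running return estimates, the task names (`main`, `reach`, `grasp`), the `success_prob` table standing in for an environment, and the replay layout. No policy learning is shown; the buffer only records the data that off-policy learners could consume.

```python
import math
import random


class Scheduler:
    """Hypothetical scheduler: picks which task's policy to execute next."""

    def __init__(self, tasks, temperature=1.0, seed=0):
        self.tasks = list(tasks)
        self.temperature = temperature
        self.rng = random.Random(seed)
        # Running mean of the main-task return observed after executing each task.
        self.value = {t: 0.0 for t in self.tasks}
        self.count = {t: 0 for t in self.tasks}

    def probabilities(self):
        # Softmax over the value estimates, shifted by the max for stability.
        m = max(self.value.values())
        weights = [math.exp((self.value[t] - m) / self.temperature)
                   for t in self.tasks]
        total = sum(weights)
        return [w / total for w in weights]

    def choose(self):
        return self.rng.choices(self.tasks, weights=self.probabilities(), k=1)[0]

    def update(self, task, main_return):
        # Incremental mean update for the task that was just executed.
        self.count[task] += 1
        self.value[task] += (main_return - self.value[task]) / self.count[task]


def run(num_episodes=200, seed=0):
    tasks = ["main", "reach", "grasp"]  # hypothetical task names
    scheduler = Scheduler(tasks, temperature=0.1, seed=seed)
    rng = random.Random(seed)
    # Toy stand-in for an environment (assumption): the chance that executing
    # a given task's policy ends with the sparse main-task reward.
    success_prob = {"main": 0.05, "reach": 0.1, "grasp": 0.6}
    replay = []
    for _ in range(num_episodes):
        task = scheduler.choose()
        main_reward = 1.0 if rng.random() < success_prob[task] else 0.0
        # Each record carries a reward entry for every task, so any task's
        # learner could reuse data collected while another task was executed.
        rewards = {t: 0.0 for t in tasks}
        rewards["main"] = main_reward
        replay.append({"executed": task, "rewards": rewards})
        scheduler.update(task, main_reward)
    return scheduler, replay


if __name__ == "__main__":
    scheduler, replay = run()
    print(scheduler.count)
    print(len(replay))
```

The design choice to store one reward per task in every record is what makes the sketch off-policy in spirit: the task that generated the data and the task that learns from it need not be the same.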

Bibliographic details

Source: JMLR: Workshop and Conference Proceedings, 2018, No. 4, 10 pages
Authors: Martin Riedmiller; Roland Hafner; Thomas Lampe; Michael Neunert; Jonas Degrave; Tom Wiele; Vlad Mnih; Nicolas Heess; Jost Tobias Springenberg
Original format: PDF
CLC classification: Artificial intelligence theory
