Learning to Interrupt: A Hierarchical Deep Reinforcement Learning Framework for Efficient Exploration

Abstract

To achieve scenario intelligence, humans must transfer knowledge to robots by developing goal-oriented algorithms, which are sometimes insensitive to dynamically changing environments. Although deep reinforcement learning has recently achieved significant success, its long and difficult training process limits its application. In this paper, we propose a hybrid structure named Option-Interruption, in which human knowledge is embedded into a hierarchical reinforcement learning framework. Our architecture has two key components: options, represented by existing human-designed methods, which can significantly speed up training; and an interruption mechanism, based on learnable termination functions, which enables the system to terminate the current option in a timely manner according to the external environment. To implement this architecture, we derive a set of update rules based on policy gradient methods and present a complete training process. In the experiments, our method is evaluated on two simulated tasks, Four-room navigation and an exploration task, and the results show the efficiency and flexibility of our framework.
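The two components named in the abstract can be illustrated with a small sketch. This is not the authors' implementation or their derived update rules: the 1-D corridor, the linear-sigmoid form of the termination function, the random high-level choice of options, and all names (`Termination`, `OPTIONS`, `features`, `run_episode`) are assumptions made only for illustration. The update shown is a generic score-function (REINFORCE-style) step on a Bernoulli termination decision.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Termination:
    """Learnable termination function beta(s) = sigmoid(w . phi(s)).

    The linear-sigmoid form is an assumption for this sketch.
    """

    def __init__(self, n_features):
        self.w = [0.0] * n_features

    def prob(self, phi):
        return sigmoid(sum(w * f for w, f in zip(self.w, phi)))

    def update(self, phi, terminated, advantage, lr=0.5):
        # Gradient of the Bernoulli log-likelihood w.r.t. w is
        # (indicator - p) * phi; scale it by an advantage signal.
        p = self.prob(phi)
        indicator = 1.0 if terminated else 0.0
        for i, f in enumerate(phi):
            self.w[i] += lr * advantage * (indicator - p) * f


# Hand-designed options (stand-ins for existing human-designed methods).
OPTIONS = {"left": lambda s: -1, "right": lambda s: +1}


def features(state, size):
    # Hypothetical features: a bias term and the normalized position.
    return [1.0, state / (size - 1)]


def run_episode(term, size=10, goal=9, max_steps=50, seed=0):
    """Run options in a 1-D corridor, interrupting them via `term`."""
    rng = random.Random(seed)
    state, steps, trace = 0, 0, []
    name = rng.choice(sorted(OPTIONS))
    while state != goal and steps < max_steps:
        # The current option acts until the termination function fires.
        state = min(size - 1, max(0, state + OPTIONS[name](state)))
        steps += 1
        phi = features(state, size)
        terminated = rng.random() < term.prob(phi)
        trace.append((phi, terminated))
        if terminated:
            # Interruption: hand control back and pick a new option.
            name = rng.choice(sorted(OPTIONS))
    return state, steps, trace
```

A training loop would call `run_episode`, compute an advantage estimate for each recorded step, and pass it to `Termination.update`; how that advantage is estimated is left open here.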

Bibliographic record

Source
IEEE International Conference on Robotics and Biomimetics | 2018 | pp. 648-653 | 6 pages
Authors
Tingguang Li; Jin Pan; Delong Zhu; Max Q.-H. Meng
Original format
PDF
Keywords
Robots; Training; Reinforcement learning; Gradient methods; Navigation; Decision making; Heuristic algorithms

Similar literature

1. PP-PG: Combining Parameter Perturbation with Policy Gradient Methods for Effective and Efficient Explorations in Deep Reinforcement Learning [J]. Li Shilei, Li Meng, Su Jiongming. ACM Transactions on Intelligent Systems and Technology, 2021, No. 3.
2. Towards integrated dialogue policy learning for multiple domains and intents using Hierarchical Deep Reinforcement Learning [J]. Saha Tulika, Gupta Dhawal, Saha Sriparna. Expert Systems with Applications, 2020, No. Deca.
3. When Does Communication Learning Need Hierarchical Multi-Agent Deep Reinforcement Learning [J]. Marie Ossenkopf, Mackenzie Jorgensen, Kurt Geihs. Cybernetics and Systems, 2019, No. 5a8.
4. Learning to Interrupt: A Hierarchical Deep Reinforcement Learning Framework for Efficient Exploration [C]. Tingguang Li, Jin Pan, Delong Zhu. IEEE International Conference on Robotics and Biomimetics, 2018.
5. Hierarchical Deep Reinforcement Learning for Robotics and Data Science [D]. Krishnan, Sanjay. 2018.
6. Learning for a Robot: Deep Reinforcement Learning, Imitation Learning, Transfer Learning [O]. Jiang Hua, Liangcai Zeng, Gongfa Li. 2021.
7. Learning to Interrupt: A Hierarchical Deep Reinforcement Learning Framework for Efficient Exploration [O]. Tingguang Li, Jin Pan, Delong Zhu. 2018.
