Venue: IEEE International Conference on Robotics and Biomimetics

Learning to Interrupt: A Hierarchical Deep Reinforcement Learning Framework for Efficient Exploration

Abstract

To achieve scenario intelligence, humans must transfer knowledge to robots by developing goal-oriented algorithms, which are sometimes insensitive to dynamically changing environments. Although deep reinforcement learning has recently achieved significant success, its long and difficult training process limits its application. In this paper, we propose a hybrid structure named Option-Interruption, in which human knowledge is embedded into a hierarchical reinforcement learning framework. Our architecture has two key components: options, represented by existing human-designed methods, which can significantly speed up training; and an interruption mechanism, based on learnable termination functions, which enables the system to terminate the current option in a timely manner according to the external environment. To implement this architecture, we derive a set of update rules based on policy gradient methods and present a complete training process. In the experiments, our method is evaluated on two simulated tasks, Four-room navigation and an exploration task, and the results show the efficiency and flexibility of our framework.
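
As a rough illustration of the two components named in the abstract, the sketch below pairs fixed, hand-designed controllers (the options) with a per-option termination function that decides when to interrupt. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `Option`, `run_episode`, `choose_option`, and `features_of`, the linear-sigmoid form of the termination function, and the toy one-dimensional corridor are all hypothetical, and the policy-gradient update rules derived in the paper are not reproduced here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Option:
    """A fixed, hand-designed controller plus termination parameters."""

    def __init__(self, name, controller, weights):
        self.name = name
        self.controller = controller  # state -> action; never trained
        self.weights = list(weights)  # termination parameters (would be learned)

    def termination_prob(self, features):
        # beta(s) = sigmoid(w . phi(s)); this functional form is an assumption.
        return sigmoid(sum(w * f for w, f in zip(self.weights, features)))


def run_episode(options, choose_option, step, features_of, state, max_steps, rng):
    """Run options, interrupting the active one when its termination fires."""
    trace = []
    active = choose_option(state)
    for _ in range(max_steps):
        action = options[active].controller(state)
        state, done = step(state, action)
        trace.append((active, action, state))
        if done:
            break
        # Interruption: sample the termination function of the active option.
        if rng.random() < options[active].termination_prob(features_of(state)):
            active = choose_option(state)
    return trace


# Toy corridor (hypothetical): integer positions, goal at position 3.
GOAL = 3
options = {
    # A weight of -50 makes the termination probability effectively zero.
    "right": Option("right", lambda s: +1, [-50.0]),
    "left": Option("left", lambda s: -1, [-50.0]),
}
trace = run_episode(
    options,
    choose_option=lambda s: "right" if s < GOAL else "left",
    step=lambda s, a: (s + a, s + a == GOAL),
    features_of=lambda s: [1.0],
    state=0,
    max_steps=10,
    rng=random.Random(0),
)
print(trace)
```

In this sketch the termination weights are set by hand so that the active option is almost never interrupted. In a trained system of the kind the abstract describes, those parameters would instead be adjusted by the learning procedure so that an option is cut short when the environment changes.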