Efficient Exploration by Novelty-Pursuit

Abstract

Efficient exploration is essential for reinforcement learning in tasks with huge state spaces and long planning horizons. Recent approaches to this issue include intrinsically motivated goal exploration processes (IMGEP) and maximum state entropy exploration (MSEE). In this paper, we propose a goal-selection criterion for IMGEP based on the principle of MSEE, which results in a new exploration method, novelty-pursuit. Novelty-pursuit performs exploration in two stages: first, it selects a seldom-visited state as the target for the goal-conditioned exploration policy, driving the agent to the boundary of the explored region; then, it takes random actions to explore the unexplored region. We demonstrate the effectiveness of the proposed method in environments ranging from simple mazes and MuJoCo tasks to the long-horizon video game SuperMarioBros. Experimental results show that the proposed method outperforms state-of-the-art approaches based on curiosity-driven exploration.
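The two-stage procedure described above can be made concrete with a small sketch. The toy GridWorld, the greedy reach_goal stand-in for the learned goal-conditioned policy, and the count-based goal choice below are illustrative assumptions, not the paper's implementation; they only show the shape of the loop: pick a seldom-visited state as the goal, drive the agent there, then act randomly to push past the explored boundary.

```python
import random
from collections import defaultdict

class GridWorld:
    """Toy deterministic grid; states are (x, y) cells."""
    def __init__(self, size=10):
        self.size = size
        self.state = (0, 0)

    def reset(self):
        self.state = (0, 0)
        return self.state

    def step(self, action):
        x, y = self.state
        dx, dy = [(0, 1), (0, -1), (1, 0), (-1, 0)][action]
        self.state = (min(max(x + dx, 0), self.size - 1),
                      min(max(y + dy, 0), self.size - 1))
        return self.state

def reach_goal(env, goal, counts, max_steps=40):
    """Stage 1: move to the selected goal state.

    A greedy placeholder for the goal-conditioned exploration policy.
    """
    state = env.reset()
    counts[state] += 1
    for _ in range(max_steps):
        if state == goal:
            break
        x, y = state
        gx, gy = goal
        # Step greedily toward the goal (a learned policy in the paper).
        if x != gx:
            action = 2 if gx > x else 3
        else:
            action = 0 if gy > y else 1
        state = env.step(action)
        counts[state] += 1

def novelty_pursuit(episodes=200, random_steps=20, seed=0):
    random.seed(seed)
    env = GridWorld()
    counts = defaultdict(int)
    counts[env.reset()] += 1
    for _ in range(episodes):
        # Goal selection: the least-visited state approximates the
        # boundary of the explored region (count-based surrogate for
        # the MSEE-motivated criterion).
        goal = min(counts, key=counts.get)
        reach_goal(env, goal, counts)
        # Stage 2: random actions explore past the boundary.
        for _ in range(random_steps):
            state = env.step(random.randrange(4))
            counts[state] += 1
    return counts

if __name__ == "__main__":
    visits = novelty_pursuit()
    print(f"visited {len(visits)} distinct states")
```

In the paper, the goal-conditioned policy is trained rather than hard-coded, and goal selection follows the MSEE-based criterion; the visit-count minimum used here is only a simple stand-in for "seldom visited".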
