Conference: IEEE/RSJ International Workshop on Intelligent Robots and Systems

Coarse planning for landmark navigation in a neural-network reinforcement-learning robot


Abstract

Is it possible to plan at a coarse level and act at a fine level with a neural-network (NN) reinforcement-learning (RL) planner? This work presents an NN planner that plans at an abstract level and is used to control a simulated robot in a stochastic landmark-navigation problem. The controller has both reactive components, based on actor-critic RL, and planning components inspired by the Dyna-PI architecture (which roughly corresponds to RL plus a model of the environment). Coarse planning is based on macro-actions, each defined as a sequence of identical primitive actions. Planning updates the evaluations and the action policy while generating simulated experience at the macro level with the model of the environment (an NN trained at the macro level). The simulations show how the controller works. They also show the advantages of using a discount coefficient tuned to the level of planning coarseness, and suggest that discounted RL has problems in dealing with long periods of time.
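As a rough illustration of the two ideas in the abstract (macro-actions built from repeated primitive actions, and a discount coefficient matched to planning coarseness), the following minimal Python sketch may help. It is not the paper's implementation. All function names are hypothetical, and the rule of raising the primitive discount to the power of the macro-action length is an assumption made here for illustration; the abstract does not state how the discount coefficient was tuned.

```python
from typing import List


def macro_action(action: int, length: int) -> List[int]:
    """A macro-action: the same primitive action repeated `length` times."""
    return [action] * length


def macro_discount(gamma: float, length: int) -> float:
    """Assumed tuning rule: one macro step spans `length` primitive steps,
    so it is discounted by gamma ** length."""
    return gamma ** length


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Standard discounted sum: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    return sum(r * gamma ** t for t, r in enumerate(rewards))


def macro_level_return(rewards: List[float], gamma: float, length: int) -> float:
    """Evaluate the same reward stream at the macro level.

    Rewards are grouped into chunks of `length` primitive steps. Each chunk
    is summarized by its own discounted reward, and chunks are then
    discounted with the macro-level coefficient.
    """
    chunks = [rewards[i:i + length] for i in range(0, len(rewards), length)]
    macro_rewards = [discounted_return(chunk, gamma) for chunk in chunks]
    return discounted_return(macro_rewards, macro_discount(gamma, length))


if __name__ == "__main__":
    rewards = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 2.0]
    fine = discounted_return(rewards, 0.9)
    coarse = macro_level_return(rewards, 0.9, 4)
    print(macro_action(2, 3), fine, coarse)
```

Under this assumed rule the macro-level evaluation agrees with the primitive-level one, which is one way to read "a discount coefficient tuned to the level of planning coarseness". Reusing the primitive discount unchanged at the macro level would instead weight distant rewards differently.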
