Journal: Neurocomputing

Hybrid Hierarchical Reinforcement Learning for online guidance and navigation with partial observability


Abstract

Autonomous guidance and navigation problems often involve high-dimensional spaces and multiple objectives, and consequently a large number of states and actions, a difficulty known as the 'curse of dimensionality'. Furthermore, systems often have only partial observability rather than perfect perception of their environment. Recent research has sought to address these problems with Hierarchical Reinforcement Learning, which often uses the same or similar reinforcement learning methods within one application so that multiple objectives can be combined. However, no single learning method benefits all objectives. To acquire optimal decision-making most efficiently, this paper proposes a hybrid Hierarchical Reinforcement Learning method consisting of several levels, in which the levels use different methods to optimize learning with different types of information and objectives. An algorithm based on the proposed method is provided and applied to an online guidance and navigation task. The navigation environments are complex, partially observable, and a priori unknown. Simulation results indicate that, compared with flat or non-hybrid methods, the proposed hybrid Hierarchical Reinforcement Learning method can help accelerate learning and alleviate the 'curse of dimensionality' in complex decision-making tasks. In addition, the mixture of relative micro states and absolute macro states can help reduce uncertainty or ambiguity at high levels, transfer learned results within and across tasks efficiently, and handle non-stationary environments. The proposed method can yield a hierarchical optimal policy for autonomous guidance and navigation without a priori knowledge of the system or the environment. (C) 2018 Elsevier B.V. All rights reserved.
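
The abstract does not say which learning method runs at which level, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a two-level hierarchy on a hypothetical 1-D corridor: the high level runs Q-learning over absolute macro states (region indices) and picks a neighbouring region as subgoal, while the low level runs SARSA over relative micro states (the signed offset to the subgoal). The environment, the rewards, the hyperparameters, and every function name here are illustrative assumptions.

```python
import random
from collections import defaultdict

# Hypothetical toy task: a corridor of 12 cells split into 4 regions.
N_REGIONS, WIDTH = 4, 3
GOAL_REGION = N_REGIONS - 1
GAMMA, ALPHA = 0.95, 0.1   # assumed discount factor and learning rate


def greedy(q, s, n_actions=2):
    """Return the action with the highest tabular value in state s."""
    return max(range(n_actions), key=lambda a: q[(s, a)])


def eps_greedy(q, s, eps, rng, n_actions=2):
    """Explore with probability eps, otherwise act greedily."""
    if rng.random() < eps:
        return rng.randrange(n_actions)
    return greedy(q, s, n_actions)


def run_option(q_low, pos, target, eps, rng, max_steps=20):
    """Low level: SARSA on the relative offset to the subgoal (micro state).

    The agent never sees its absolute position here, only target - pos.
    Action 0 steps left, action 1 steps right.
    """
    steps = 0
    s = target - pos
    a = eps_greedy(q_low, s, eps, rng)
    while s != 0 and steps < max_steps:
        pos = min(max(pos + (1 if a == 1 else -1), 0), N_REGIONS * WIDTH - 1)
        steps += 1
        s2 = target - pos
        r = 1.0 if s2 == 0 else -0.1
        a2 = eps_greedy(q_low, s2, eps, rng)
        # On-policy target: bootstrap from the action actually chosen next.
        boot = 0.0 if s2 == 0 else GAMMA * q_low[(s2, a2)]
        q_low[(s, a)] += ALPHA * (r + boot - q_low[(s, a)])
        s, a = s2, a2
    return pos, steps


def train(episodes=500, eps=0.2, seed=0):
    """High level: Q-learning on the absolute region index (macro state).

    Action 0 targets the centre of the left neighbour region, action 1 the
    centre of the right neighbour region.
    """
    rng = random.Random(seed)
    q_high, q_low = defaultdict(float), defaultdict(float)
    for _episode in range(episodes):
        pos = 0
        for _decision in range(20):
            region = pos // WIDTH
            if region == GOAL_REGION:
                break
            a = eps_greedy(q_high, region, eps, rng)
            target_region = min(max(region + (1 if a == 1 else -1), 0),
                                N_REGIONS - 1)
            target = target_region * WIDTH + WIDTH // 2
            pos, steps = run_option(q_low, pos, target, eps, rng)
            region2 = pos // WIDTH
            done = region2 == GOAL_REGION
            k = steps + 1  # each decision costs one extra time unit
            r = -0.1 * k + (10.0 if done else 0.0)
            # Off-policy target: bootstrap from the best next macro action,
            # discounted by the duration of the option that just finished.
            best_next = 0.0 if done else max(q_high[(region2, b)]
                                             for b in range(2))
            q_high[(region, a)] += ALPHA * (
                r + GAMMA ** k * best_next - q_high[(region, a)])
    return q_high, q_low
```

Calling train() returns the two value tables, and greedy(q_high, region) or greedy(q_low, offset) reads the learned choice at either level. Because the low-level table is indexed by relative offsets only, the same entries are reused for every subgoal, which is one plausible reading of how relative micro states could support transfer within a task.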
