
Value iteration based integral reinforcement learning approach for H∞ controller design of continuous-time nonlinear systems



Abstract

In this paper, a novel integral reinforcement learning approach based on value iteration (VI) is developed for designing the H-infinity controller of continuous-time (CT) nonlinear systems. First, the VI learning mechanism is introduced to solve the zero-sum game problem, which is equivalent to solving the Hamilton-Jacobi-Isaacs (HJI) equation arising in H-infinity control problems. Because the proposed method is based on the VI learning mechanism, it does not require an admissible control for its implementation, and thus admits a more general initial condition than works based on policy iteration (PI). The iterative property of the value function is analysed for an arbitrary initial positive function, and the H-infinity controller is obtained as the iteration converges. To implement the proposed method, three neural networks are introduced to approximate the iterative value function, the iterative control policy and the iterative disturbance policy, respectively. To verify the effectiveness of the VI-based method, a linear case and a nonlinear case are presented. (C) 2018 Elsevier B.V. All rights reserved.
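The VI idea in the abstract can be illustrated on a scalar linear zero-sum game. The sketch below is not the paper's algorithm: it assumes a hypothetical system dx/dt = a*x + b*u + d*w with running cost q*x^2 + r*u^2 - gamma^2*w^2 and a quadratic value function V_i(x) = p_i*x^2, so the integral over each interval of length T can be evaluated in closed form and no neural networks are needed. All symbols (a, b, d, q, r, gamma, T) and function names are assumptions made for illustration. Each step applies V_{i+1}(x(t)) = integral of the cost over [t, t+T] + V_i(x(t+T)), with the policies u_i = -(b/r)*p_i*x and w_i = (d/gamma^2)*p_i*x derived from the current value function.

```python
import math


def vi_step(p, a, b, d, q, r, gamma, T):
    """One value-iteration update of V_i(x) = p * x**2 over an interval T."""
    s = b**2 / r - d**2 / gamma**2    # control gain minus disturbance gain
    a_c = a - s * p                   # closed-loop pole under u_i and w_i
    growth = math.exp(2.0 * a_c * T)  # (x(t+T) / x(t))**2
    # Integral of exp(2*a_c*tau) for tau in [0, T]; guard the a_c == 0 case.
    if abs(a_c) < 1e-12:
        integral = T
    else:
        integral = math.expm1(2.0 * a_c * T) / (2.0 * a_c)
    # Running cost along the trajectory is (q + s*p**2) * x(tau)**2.
    return (q + s * p**2) * integral + p * growth


def value_iteration(p0, a, b, d, q, r, gamma, T=0.1, iters=300):
    """Iterate from an arbitrary non-negative p0; no admissible control needed."""
    p = p0
    for _ in range(iters):
        p = vi_step(p, a, b, d, q, r, gamma, T)
    return p


def riccati_solution(a, b, d, q, r, gamma):
    """Stabilizing root of s*p**2 - 2*a*p - q = 0 (requires s > 0)."""
    s = b**2 / r - d**2 / gamma**2
    return (a + math.sqrt(a**2 + q * s)) / s


if __name__ == "__main__":
    # Open-loop unstable plant (a > 0), started from the zero value function.
    params = dict(a=1.0, b=1.0, d=1.0, q=1.0, r=1.0, gamma=2.0)
    print("VI limit:     ", value_iteration(0.0, **params))
    print("Riccati root: ", riccati_solution(**params))
```

In this scalar setting the fixed point of the update coincides with the stabilizing root of the game Riccati equation, which is the linear-quadratic form of the HJI equation. The plant is open-loop unstable and the iteration starts from p0 = 0, which does not correspond to a stabilizing policy; this mirrors the abstract's point that VI does not need an admissible initial control.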

Bibliographic information

  • Source
    Neurocomputing | 2018, No. 12 | pp. 51-59 | 9 pages
  • Author affiliations

    Northeastern Univ, Coll Informat Sci & Engn, Box 134, Shenyang 110819, Liaoning, Peoples R China;

    Northeastern Univ, Coll Informat Sci & Engn, Box 134, Shenyang 110819, Liaoning, Peoples R China;

    Northeastern Univ, Coll Informat Sci & Engn, Box 134, Shenyang 110819, Liaoning, Peoples R China;

    Northeastern Univ, Coll Informat Sci & Engn, Box 134, Shenyang 110819, Liaoning, Peoples R China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA;
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Value iteration; H-infinity control; Reinforcement learning; Continuous-time systems;

