Journal: IEEE Transactions on Systems, Man, and Cybernetics

Value Iteration-Based H∞ Controller Design for Continuous-Time Nonlinear Systems Subject to Input Constraints

Abstract

This paper proposes a novel integral reinforcement learning method based on value iteration (VI) to design the $H_\infty$ controller for continuous-time nonlinear systems subject to input constraints. To handle the control constraints, a nonquadratic function is introduced to reformulate the $L_2$-gain condition of the $H_\infty$ control problem. The VI method then solves the corresponding Hamilton–Jacobi–Isaacs equation, starting from an arbitrary positive semi-definite value function. Unlike most existing methods, which are based on policy iteration, it does not require an initial admissible control policy, so the initial condition is less restrictive. The iterative process of the proposed VI method is analyzed, and convergence to the saddle-point solution is proved in a general way. For implementation, only one neural network is used to approximate the iterative value function, which yields a simpler architecture with a lower computational load than schemes that use three neural networks. The effectiveness of the VI-based method is verified on two nonlinear examples.
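The abstract does not state which nonquadratic function is used. A common choice in the constrained-input optimal control literature is $U(u) = 2 r \lambda \int_0^u \tanh^{-1}(v/\lambda)\,dv$ for a scalar input bounded by $|u| < \lambda$, and the sketch below assumes that form purely for illustration. The bound `lam`, the weight `r`, the placeholder `grad_term` (standing for $g(x)^\top \nabla V(x)$), and all function names are hypothetical and are not taken from the paper.

```python
import math


def nonquadratic_penalty(u: float, lam: float, r: float = 1.0) -> float:
    """Closed form of U(u) = 2*r*lam * integral_0^u atanh(v/lam) dv, valid for |u| < lam.

    Assumed form (not confirmed by the abstract). For small u it behaves like r*u**2.
    """
    if abs(u) >= lam:
        raise ValueError("u must satisfy |u| < lam")
    return (2.0 * r * lam * u * math.atanh(u / lam)
            + r * lam ** 2 * math.log(1.0 - (u / lam) ** 2))


def penalty_by_quadrature(u: float, lam: float, r: float = 1.0, n: int = 20000) -> float:
    """Midpoint-rule evaluation of the same integral, used only as a sanity check."""
    h = u / n
    total = sum(math.atanh((i + 0.5) * h / lam) for i in range(n))
    return 2.0 * r * lam * total * h


def saturated_control(grad_term: float, lam: float, r: float = 1.0) -> float:
    """Stationary point of U(u) + grad_term * u, which is u = -lam * tanh(grad_term / (2*lam*r)).

    The tanh keeps the control inside the bound [-lam, lam] by construction.
    """
    return -lam * math.tanh(grad_term / (2.0 * lam * r))
```

With this penalty the minimizing control is bounded by construction, which is why a nonquadratic cost is a natural way to encode input constraints in the $L_2$-gain condition.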