International Journal of Adaptive Control and Signal Processing

Output-feedback H_∞ quadratic tracking control of linear systems using reinforcement learning



Abstract

This paper presents an online learning algorithm based on integral reinforcement learning (IRL) to design an output-feedback (OPFB) H_∞ tracking controller for partially unknown linear continuous-time systems. Although reinforcement learning techniques have been successfully applied to find optimal state-feedback controllers, in most control applications it is impractical to measure the full system state, so OPFB controllers are desirable. To this end, a general bounded L_2-gain tracking problem with a discounted performance function is formulated for OPFB H_∞ tracking. A tracking game algebraic Riccati equation is then derived whose solution gives a Nash equilibrium of the associated min-max optimization problem. An IRL algorithm is developed to solve this game algebraic Riccati equation online without requiring complete knowledge of the system dynamics. At each iteration, the algorithm solves an IRL Bellman equation online, in real time, to evaluate an OPFB policy, and then updates the OPFB gain using the information provided by the evaluated policy. An adaptive observer supplies the full-state estimates needed by the IRL Bellman equation during learning; once learning is finished, the observer is no longer required. A simulation example verifies the convergence of the proposed algorithm to a suboptimal OPFB solution and demonstrates the performance of the proposed method.
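The iterative structure the abstract describes can be illustrated with a minimal sketch, under stated assumptions: model-based policy iteration on the (undiscounted, state-feedback, regulation) zero-sum game algebraic Riccati equation that underlies H_∞ design. The paper's IRL algorithm instead evaluates each policy from measured trajectory data via an IRL Bellman equation, so the drift matrix `A` is not needed online; the discount factor, reference-tracking augmentation, output-feedback parametrization, and adaptive observer are all omitted here, and every numerical value (`A`, `B`, `D`, `Q`, `R`, `gamma`) is made up for illustration.

```python
import numpy as np

def solve_lyapunov(Ac, Qc):
    """Solve Ac' P + P Ac + Qc = 0 for P via Kronecker vectorization."""
    n = Ac.shape[0]
    M = np.kron(np.eye(n), Ac.T) + np.kron(Ac.T, np.eye(n))
    p = np.linalg.solve(M, -Qc.flatten(order="F"))
    return p.reshape(n, n, order="F")

def game_are_policy_iteration(A, B, D, Q, R, gamma, n_iter=100, tol=1e-12):
    """Alternate policy evaluation / improvement for both players until P
    converges to the saddle-point (Nash) solution of
    A'P + PA + Q - P B R^{-1} B' P + (1/gamma^2) P D D' P = 0."""
    n = A.shape[0]
    P = np.zeros((n, n))
    K = np.zeros((B.shape[1], n))      # control gain, start from zero policy
    L = np.zeros((D.shape[1], n))      # worst-case disturbance gain
    for _ in range(n_iter):
        Ac = A - B @ K + D @ L         # closed loop under current policies
        Qc = Q + K.T @ R @ K - gamma**2 * L.T @ L
        P_new = solve_lyapunov(Ac, Qc)             # policy evaluation
        K = np.linalg.solve(R, B.T @ P_new)        # minimizing-player update
        L = (1.0 / gamma**2) * D.T @ P_new         # maximizing-player update
        if np.linalg.norm(P_new - P) < tol:
            return P_new, K, L
        P = P_new
    return P, K, L

A = np.array([[0.0, 1.0], [-1.0, -2.0]])   # open-loop stable drift (made up)
B = np.array([[0.0], [1.0]])               # control input channel
D = np.array([[0.0], [0.5]])               # disturbance input channel
Q, R, gamma = np.eye(2), np.array([[1.0]]), 5.0   # gamma: L2-gain bound

P, K, L = game_are_policy_iteration(A, B, D, Q, R, gamma)
residual = (A.T @ P + P @ A + Q
            - P @ B @ np.linalg.solve(R, B.T @ P)
            + (1.0 / gamma**2) * P @ D @ D.T @ P)
print(np.linalg.norm(residual))            # near zero at the Nash solution
```

In the model-free IRL version, the Lyapunov solve in the policy-evaluation step is replaced by a least-squares fit of the same value matrix from integrals of the measured cost along system trajectories, which is what removes the need for complete knowledge of the dynamics.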

