IEEE International Conference on Cybernetics and Intelligent Systems; IEEE International Conference on Robotics, Automation and Mechatronics

H∞ optimal control of unknown linear discrete-time systems: An off-policy reinforcement learning approach



Abstract

This paper proposes a model-free H∞ control design for linear discrete-time systems using reinforcement learning (RL). A novel off-policy RL algorithm solves the game algebraic Riccati equation (GARE) online using data measured along the system trajectories. Compared to existing model-free RL methods for the H∞ control problem, the proposed algorithm has the following advantages: 1) It is data-efficient and fast, since a single stream of experience obtained by executing a fixed behavioral policy is reused to sequentially update the value functions of many different learning policies. 2) The disturbance input does not need to be adjusted in a specific manner. 3) No bias is introduced by adding probing noise to the control input to maintain the persistence-of-excitation condition. A simulation example verifies the effectiveness of the proposed control scheme.
