Off-Policy Actor-Critic Structure for Optimal Control of Unknown Systems With Disturbances

R. Song; F. L. Lewis; Q. Wei; H. Zhang

Abstract

An optimal control method is developed for unknown continuous-time systems with unknown disturbances in this paper. The integral reinforcement learning (IRL) algorithm is presented to obtain the iterative control. Off-policy learning is used to allow the dynamics to be completely unknown. Neural networks are used to construct critic and action networks. It is shown that if there are unknown disturbances, off-policy IRL may not converge or may be biased. For reducing the influence of unknown disturbances, a disturbances compensation controller is added. It is proven that the weight errors are uniformly ultimately bounded based on Lyapunov techniques. Convergence of the Hamiltonian function is also proven. The simulation study demonstrates the effectiveness of the proposed optimal control method for unknown systems with disturbances.
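
To make the off-policy IRL idea in the abstract concrete, here is a minimal sketch on a scalar linear-quadratic problem. It is a simplified stand-in, not the authors' neural-network algorithm, and it omits the disturbance compensation controller. It assumes a hypothetical plant x' = a*x + b*u + d, a cost given by the integral of q*x^2 + r*u^2, a critic V(x) = p*x^2, and an actor u = -k*x. Data are collected once under a behavior policy with a probing signal. Each iteration then fits p and the improved gain k_next by least squares from the off-policy relation p*[x(t+T)^2 - x(t)^2] - 2*r*k_next*∫x*(u + k_i*x)dτ = -(q + r*k_i^2)*∫x^2 dτ, without using a or b. The names `collect`, `off_policy_irl`, and `probe`, and all numerical values, are illustrative assumptions.

```python
import math

# Hypothetical plant x' = A_TRUE*x + B_TRUE*u + d, used only to generate data;
# the learner below never reads A_TRUE or B_TRUE.
A_TRUE, B_TRUE = 1.0, 1.0
Q, R = 1.0, 1.0  # running cost Q*x^2 + R*u^2 (assumed known to the designer)


def probe(t):
    """Exploration signal added to the behavior policy (assumed form)."""
    return 0.5 * (math.sin(t) + math.sin(3.7 * t) + math.sin(9.1 * t))


def collect(k_b, x0=1.0, dt=1e-3, steps=100, windows=60, d=0.0):
    """Simulate once under u = -k_b*x + probe(t) and return, per window,
    (x(t+T)^2 - x(t)^2, integral of x^2, integral of x*u)."""
    def u_of(t, x):
        return -k_b * x + probe(t)

    def f(t, x):
        return A_TRUE * x + B_TRUE * u_of(t, x) + d

    data, t, x = [], 0.0, x0
    for _ in range(windows):
        x_start, ixx, ixu = x, 0.0, 0.0
        for _ in range(steps):
            # Classical RK4 step for the closed-loop plant.
            k1 = f(t, x)
            k2 = f(t + dt / 2, x + dt / 2 * k1)
            k3 = f(t + dt / 2, x + dt / 2 * k2)
            k4 = f(t + dt, x + dt * k3)
            x_new = x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
            # Trapezoidal rule for the two integrals over this step.
            ixx += 0.5 * dt * (x * x + x_new * x_new)
            ixu += 0.5 * dt * (x * u_of(t, x) + x_new * u_of(t + dt, x_new))
            t, x = t + dt, x_new
        data.append((x * x - x_start * x_start, ixx, ixu))
    return data


def off_policy_irl(data, k_init, iterations=8):
    """Reuse the same data at every iteration to fit the critic weight p and
    the next actor gain k via 2x2 least-squares normal equations."""
    k, p = k_init, float("nan")
    for _ in range(iterations):
        s11 = s12 = s22 = b1 = b2 = 0.0
        for dxx, ixx, ixu in data:
            phi1 = dxx                           # multiplies p
            phi2 = -2.0 * R * (ixu + k * ixx)    # multiplies k_next
            y = -(Q + R * k * k) * ixx           # cost of the current policy
            s11 += phi1 * phi1
            s12 += phi1 * phi2
            s22 += phi2 * phi2
            b1 += phi1 * y
            b2 += phi2 * y
        det = s11 * s22 - s12 * s12
        p = (s22 * b1 - s12 * b2) / det
        k = (s11 * b2 - s12 * b1) / det
    return p, k


data = collect(k_b=3.0)                      # k_b = 3 stabilizes this plant
p_hat, k_hat = off_policy_irl(data, k_init=3.0)
# Riccati solution, computed from the true model only for comparison.
p_star = R * (A_TRUE + math.sqrt(A_TRUE**2 + B_TRUE**2 * Q / R)) / B_TRUE**2
print(p_hat, k_hat, p_star)
```

In the disturbance-free case the data satisfy the regression exactly, so the estimates should approach the Riccati solution. Calling `collect` with a nonzero `d` adds an unmodeled term to the data, so the fitted values will generally be biased, which is the failure mode the abstract attributes to unknown disturbances.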


Bibliographic details

Source: Cybernetics, IEEE Transactions on, 2016, Issue 5, pp. 1041-1050 (10 pages)
Authors: R. Song; F. L. Lewis; Q. Wei; H. Zhang
Author affiliation: School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China
Original format: PDF
Language: English
Keywords: Adaptive critic designs; adaptive/approximate dynamic programming (ADP); dynamic programming; off-policy; optimal control; unknown system

Similar literature

1. Synchronous optimal control method for nonlinear systems with saturating actuators and unknown dynamics using off-policy integral reinforcement learning [J]. Zhang Zenglian, Song Ruizhuo, Cao Min. Neurocomputing, 2019, Issue SEPa3.
2. Synchronous optimal control method for nonlinear systems with saturating actuators and unknown dynamics using off-policy integral reinforcement learning [J]. Zhang Zenglian, Song Ruizhuo, Cao Min. Neurocomputing, 2019, Issue Sepa3.
3. Optimal Output-Feedback Control of Unknown Continuous-Time Linear Systems Using Off-policy Reinforcement Learning [J]. Hamidreza Modares, Frank L. Lewis, Zhong-Ping Jiang. Cybernetics, IEEE Transactions on, 2016, Issue 11.
4. Off-Policy Reinforcement Learning for Optimal Preview Tracking Control of Linear Discrete-Time systems with unknown dynamics [C]. Chao-Ran Wang, Huai-Ning Wu. Chinese Automation Congress, 2018.
5. Optimal tracking control of uncertain systems: On-policy and off-policy reinforcement learning approaches [D]. Modares, Hamidreza. 2015.
6. Optimal Fractional-Order Active Disturbance Rejection Controller Design for PMSM Speed Servo System [O]. Pengchong Chen, Ying Luo, Yibing Peng. 2021.
7. Output Feedback H∞ Control for Linear Discrete-Time Multi-Player Systems With Multi-Source Disturbances Using Off-Policy Q-Learning [O]. Zhenfei Xiao, Jinna Li, Ping Li. 2020.
8. Predictive Feedback and Feedforward Control for Systems with Unknown Disturbances [R]. Juang, Jer-Nan; Eure, Kenneth W. 1998.
