Reinforcement Learning Controller Design for Affine Nonlinear Discrete-Time Systems using Online Approximators

Yang Q.; Jagannathan S.

Abstract

In this paper, reinforcement learning state- and output-feedback-based adaptive critic controller designs are proposed by using online approximators (OLAs) for general multi-input, multi-output affine unknown nonlinear discrete-time systems in the presence of bounded disturbances. The proposed controller design has two entities: an action network that is designed to produce the optimal signal and a critic network that evaluates the performance of the action network. The critic estimates the cost-to-go function, which is tuned online using recursive equations derived from heuristic dynamic programming. Here, neural networks (NNs) are used for both the action and critic networks, although any OLA, such as radial basis functions, splines, or fuzzy logic, can be utilized. For the output-feedback counterpart, an additional NN is designated as the observer to estimate the unavailable system states, and thus the separation principle is not required. The NN weight tuning laws for the controller schemes are also derived while ensuring uniform ultimate boundedness of the closed-loop system using Lyapunov theory. Finally, the effectiveness of the two controllers is tested in simulation on a pendulum balancing system and a two-link robotic arm system.
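
The critic's role described in the abstract (estimating a cost-to-go function that is tuned online from a recursive, dynamic-programming-style relation) can be illustrated with a minimal sketch. This is not the paper's tuning law. It assumes a hypothetical scalar plant in affine form with f(x) = A x and g(x) = B, a fixed gain K standing in for the action network, a discount factor GAMMA, and a one-weight critic, all chosen so that the exact cost-to-go is known in closed form and the learned weight can be checked against it.

```python
import numpy as np

# Hypothetical scalar plant in affine form x_{k+1} = f(x_k) + g(x_k) * u_k,
# with f(x) = A * x and g(x) = B (illustrative values, not from the paper).
A, B = 0.9, 1.0
K = 0.4            # fixed feedback gain standing in for the action network
Q, R = 1.0, 0.1    # stage cost r(x, u) = Q * x^2 + R * u^2
GAMMA = 0.95       # discount factor (an assumption of this sketch)


def action(x):
    """Stand-in for the action network: a fixed linear policy u = -K * x."""
    return -K * x


def stage_cost(x, u):
    """Quadratic one-step cost."""
    return Q * x**2 + R * u**2


def critic(w, x):
    """One-weight critic: J_hat(x) = w * x^2."""
    return w * x**2


def train_critic(steps=5000, lr=0.5, seed=0):
    """Tune the critic weight online from one-step Bellman residuals."""
    rng = np.random.default_rng(seed)
    w = 0.0
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)       # resample the state for excitation
        u = action(x)
        x_next = A * x + B * u
        # Bellman residual: r(x, u) + gamma * J_hat(x_next) - J_hat(x)
        delta = stage_cost(x, u) + GAMMA * critic(w, x_next) - critic(w, x)
        w += lr * delta * x**2           # semi-gradient step; dJ_hat/dw = x^2
    return w


def exact_weight():
    """Closed-form coefficient p with J(x) = p * x^2 under u = -K * x."""
    c = A - B * K                        # closed-loop pole
    return (Q + R * K**2) / (1.0 - GAMMA * c**2)


if __name__ == "__main__":
    print("learned:", train_critic(), "exact:", exact_weight())
```

In this special case the residual is proportional to the weight error, so each update shrinks the error and the learned weight approaches the closed-form value. A nonlinear plant would need a richer approximator (for example an NN or radial basis functions, as the abstract notes), and convergence would then have to be argued separately.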

Bibliographic details

Source: Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, 2012, No. 2, pp. 377-390 (14 pages)
Authors: Yang Q.; Jagannathan S.
Affiliation: State Key Laboratory of Industrial Control Technology, Department of Control Science and Engineering, Zhejiang University, Hangzhou, China
Original format: PDF
Language: English
Keywords:
Adaptive critic; Lyapunov method; dynamic programming (DP); neural networks (NNs); online approximators (OLAs); online learning; reinforcement learning;

Similar literature

1. Reinforcement learning-based online adaptive controller design for a class of unknown nonlinear discrete-time systems with time delays [J]. Liang Yuling, Zhang Huaguang, Xiao Geyang. Neural computing & applications, 2018, No. 6.
2. Fault-Tolerant Controller Design for a Class of Nonlinear MIMO Discrete-Time Systems via Online Reinforcement Learning Algorithm [J]. Z. Wang, L. Liu, H. Zhang. IEEE Transactions on Systems, Man, and Cybernetics, 2016, No. 5.
3. Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning [J]. Xiong Yang, Derong Liu, Ding Wang. Neural Networks: The Official Journal of the International Neural Network Society, 2014.
4. Online Reinforcement Learning-based Neural Network Controller Design for Affine Nonlinear Discrete-time Systems [C]. Yang, Qinmin; Jagannathan. 2007.
5. Data-Based Reinforcement Learning: Approximate Optimal Control for Uncertain Nonlinear Systems [D]. Deptuła, Patryk. 2019.
6. Design of an Optimal Preview Controller for Linear Discrete-Time Descriptor Noncausal Multirate Systems [O]. Mengjuan Cao, Fucheng Liao.
7. Reinforcement Learning Neural-Network-Based Controller for Nonlinear Discrete-Time Systems With Input Constraints [O]. Pingan He, S. Jagannathan. 2013.
