Journal: Neural Networks: The Official Journal of the International Neural Network Society

Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning

Abstract

In this paper, a reinforcement-learning-based direct adaptive controller is developed to deliver a desired tracking performance for a class of discrete-time (DT) nonlinear systems with unknown bounded disturbances. We investigate multi-input multi-output unknown nonaffine nonlinear DT systems and employ two neural networks (NNs). Using the implicit function theorem, an action NN is used to generate the control signal; it is also designed to cancel the nonlinearity of the unknown DT system so that feedback linearization methods can be applied. A critic NN is applied to estimate the cost function, which satisfies the recursive equations derived from heuristic dynamic programming. The weights of both the action NN and the critic NN are updated directly online rather than trained offline. Using Lyapunov's direct method, the closed-loop tracking errors and the NN weight estimates are shown to be uniformly ultimately bounded. Two numerical examples are provided to show the effectiveness of the proposed approach.
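The abstract does not give the update laws, so the following is only a minimal sketch of what an online critic update in the heuristic dynamic programming style can look like; it is not the paper's algorithm. It assumes a critic that is linear in a fixed feature vector and a temporal-difference-style error e(k) = gamma * J(k) - (J(k-1) - r(k)), which is one form commonly used for such critics. The feature map, the toy plant, the stage cost, and all parameter values are hypothetical, and the action NN is omitted.

```python
import math


def features(x):
    """Hypothetical fixed feature vector (stand-in for a critic hidden layer)."""
    return [math.tanh(x), math.tanh(2.0 * x), 1.0]


def dot(w, phi):
    """Inner product of two equal-length lists."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def critic_step(w, phi_prev, phi_now, r, gamma=0.9, lr=0.1):
    """One online gradient step on 0.5 * e**2, where
    e = gamma * J(k) - (J(k-1) - r(k)) and J = w . phi.
    The gradient is taken through J(k) only (an assumption of this sketch)."""
    e = gamma * dot(w, phi_now) - (dot(w, phi_prev) - r)
    w_new = [wi - lr * e * gamma * pi for wi, pi in zip(w, phi_now)]
    return w_new, e


def run_online(steps=50):
    """Update the critic at every time step; there is no offline training phase."""
    w = [0.0, 0.0, 0.0]
    x = 1.0
    phi_prev = features(x)
    for _ in range(steps):
        x = 0.8 * x            # hypothetical stable toy plant, no control input
        r = x * x              # hypothetical quadratic stage cost
        phi_now = features(x)
        w, _ = critic_step(w, phi_prev, phi_now, r)
        phi_prev = phi_now
    return w
```

Calling `run_online()` returns the critic weights after 50 online steps. In the setting the abstract describes, a second network (the action NN) would be updated in the same loop and would supply the control input to the plant.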
