Journal: IEEE Transactions on Systems, Man, and Cybernetics

Robust Optimal Control for Disturbed Nonlinear Zero-Sum Differential Games Based on Single NN and Least Squares

Abstract

This paper establishes an approximate optimal critic learning algorithm, based on single-neural-network (NN) policy iteration (PI), for solving continuous-time (CT) two-player zero-sum games (ZSGs). A practical difficulty is that errors disturb the dynamics, and identifying the dynamics in turn generates further errors. To limit the effect of these errors, we develop a single-NN-based online PI algorithm for the CT system, which is a disturbed nonlinear ZSG. Given sufficient online data, the Hamilton–Jacobi–Isaacs equation can be solved without complete knowledge of the dynamics, and the NN weights are then obtained by the least-squares method. Moreover, for the undisturbed system, we find that the way the NN weights are obtained in this paper is equivalent to obtaining the optimal solution by the Gauss–Newton method. Based on the convergence of the Gauss–Newton method, the optimal controller for the undisturbed system can be obtained efficiently from online data. With the controller of the undisturbed system in hand, we then take the disturbance into account and design a robust control pair to overcome it. A set of simulations demonstrates the effectiveness of the algorithm, and the results verify that it can solve the disturbed nonlinear ZSG.
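
The link between least squares and the Gauss–Newton method mentioned in the abstract can be illustrated with a generic sketch. It is not the paper's algorithm. It assumes a critic that is linear in its weights, a hypothetical quadratic basis (`features`), and synthetic targets in place of the online data. Under those assumptions the residual is linear in the weights, so a single Gauss–Newton step from any starting point lands on the least-squares solution.

```python
import numpy as np

def features(x):
    # Hypothetical critic basis [x1^2, x1*x2, x2^2]; chosen for illustration only.
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([x1**2, x1 * x2, x2**2])

def least_squares_weights(phi, y):
    # Solve min_w ||phi @ w - y||^2 directly.
    w, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return w

def gauss_newton_step(phi, y, w0):
    # Residual r(w) = phi @ w - y has Jacobian phi with respect to w.
    residual = phi @ w0 - y
    step = np.linalg.solve(phi.T @ phi, phi.T @ residual)
    return w0 - step

# Synthetic stand-in for sampled online data (an assumption, not from the paper).
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(50, 2))
phi = features(states)
true_w = np.array([1.0, 0.5, 2.0])
targets = phi @ true_w + 0.01 * rng.standard_normal(50)

w_ls = least_squares_weights(phi, targets)
w_gn = gauss_newton_step(phi, targets, w0=np.zeros(3))
print(w_ls, w_gn)
```

With a nonlinear residual the two would differ, and Gauss–Newton would need several iterations; the sketch only covers the linear-in-weights case.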
