Journal: IEEE Transactions on Neural Networks and Learning Systems

Online Solution of Two-Player Zero-Sum Games for Continuous-Time Nonlinear Systems With Completely Unknown Dynamics



Abstract

This paper presents an online adaptive algorithm for learning the Nash equilibrium solution, i.e., the optimal policy pair, of two-player zero-sum games for continuous-time nonlinear systems with completely unknown dynamics. First, the simultaneous policy updating algorithm (SPUA) for known systems is reviewed, and a new analytical method for proving its convergence is presented. Then, building on the SPUA and using no a priori knowledge of the system dynamics, an online algorithm is proposed that learns in real time the minimal nonnegative solution of the Hamilton–Jacobi–Isaacs (HJI) equation (or, in the special case of linear systems, of the generalized algebraic Riccati equation) together with the optimal policy pair. The approximate solution of the HJI equation and the admissible policy pair are re-expressed using the approximation theorem, and their unknown constants or weights are identified simultaneously by the recursive least squares method. Convergence of the online algorithm to the optimal solutions is established. A practical online algorithm is also developed, and simulation results illustrate the effectiveness of the proposed method.
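
The abstract names recursive least squares as the tool for identifying the unknown weights. The paper's actual regressors and targets are not given here, so the following is only a minimal, generic sketch of a recursive least squares update for a linear-in-parameters model y ≈ φ · θ. The function names (`rls_init`, `rls_update`, `demo`), the initial covariance scale `delta`, the forgetting factor `lam`, and the synthetic data are all assumptions made for illustration, not the authors' implementation.

```python
import random


def rls_init(n, delta=1e3):
    """Return initial weights (zeros) and covariance P = delta * I (assumed values)."""
    theta = [0.0] * n
    P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]
    return theta, P


def rls_update(theta, P, phi, y, lam=1.0):
    """One recursive least squares step for the model y ~ phi . theta."""
    n = len(theta)
    # P * phi (P is symmetric, so phi^T * P is the same vector transposed)
    Pphi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * Pphi[i] for i in range(n))
    gain = [v / denom for v in Pphi]
    # Prediction error with the current weights
    err = y - sum(phi[i] * theta[i] for i in range(n))
    theta = [theta[i] + gain[i] * err for i in range(n)]
    # Covariance update: P <- (P - gain * (P phi)^T) / lam
    P = [[(P[i][j] - gain[i] * Pphi[j]) / lam for j in range(n)]
         for i in range(n)]
    return theta, P


def demo(seed=0, steps=200):
    """Recover hypothetical weights from noise-free synthetic samples."""
    rng = random.Random(seed)
    true_w = [2.0, -1.0, 0.5]  # hypothetical weights, for illustration only
    theta, P = rls_init(3)
    for _ in range(steps):
        phi = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        y = sum(w * p for w, p in zip(true_w, phi))
        theta, P = rls_update(theta, P, phi, y)
    return theta
```

Calling `demo()` returns a weight estimate that should lie close to the hypothetical `true_w`. In an online setting of the kind the abstract describes, each new measurement would trigger one `rls_update` call, so the weights are refined sample by sample rather than by a batch solve.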
