Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations

Kyriakos G. Vamvoudakis; Frank L. Lewis

Abstract

In this paper we present an online adaptive control algorithm based on policy iteration reinforcement learning techniques to solve the continuous-time (CT) multi-player non-zero-sum (NZS) game with infinite horizon for linear and nonlinear systems. NZS games allow players to have a cooperative team component and an individual selfish component of strategy. The adaptive algorithm learns online the solution of coupled Riccati equations for linear systems and of coupled Hamilton-Jacobi equations for nonlinear systems. This adaptive control method finds approximations of the optimal value and the NZS Nash equilibrium in real time, while also guaranteeing closed-loop stability. The optimal-adaptive algorithm is implemented as a separate actor/critic parametric network approximator structure for every player, and involves simultaneous continuous-time adaptation of the actor/critic networks. A persistence of excitation condition is shown to guarantee convergence of every critic to the actual optimal value function for that player. A detailed mathematical analysis is given for 2-player NZS games. Novel tuning algorithms are given for the actor/critic networks. Convergence to the Nash equilibrium is proven, and stability of the system is also guaranteed. This provides optimal adaptive control solutions for both non-zero-sum games and their special case, zero-sum games. Simulation examples show the effectiveness of the new algorithm.
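
For the linear case, the coupled Riccati equations mentioned in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' online actor/critic algorithm: it is an offline, model-based Lyapunov (policy) iteration, shown only to make the coupled equations concrete. It assumes dynamics dx/dt = A x + sum_j B_j u_j, costs J_i = integral of (x' Q_i x + sum_j u_j' R_ij u_j) dt, and feedback u_i = -K_i x with K_i = R_ii^{-1} B_i' P_i. All function and variable names are hypothetical, and this kind of iteration is not guaranteed to converge for every game.

```python
import numpy as np


def solve_lyapunov(a_cl, m):
    """Solve a_cl.T @ P + P @ a_cl + m = 0 for P via Kronecker vectorization."""
    n = a_cl.shape[0]
    eye = np.eye(n)
    lhs = np.kron(eye, a_cl.T) + np.kron(a_cl.T, eye)
    p = np.linalg.solve(lhs, -m.reshape(-1)).reshape(n, n)
    return 0.5 * (p + p.T)  # symmetrize against round-off


def lq_nash_policy_iteration(a, b, q, r, k0, iters=100, tol=1e-12):
    """Offline Lyapunov iteration for an N-player linear-quadratic NZS game.

    b[i], q[i]: input and state-weight matrices of player i.
    r[i][j]: weight of player j's control in player i's cost.
    k0[i]: initial gains; together they must stabilize the closed loop.
    """
    n_players = len(b)
    k = [ki.copy() for ki in k0]
    p = [np.zeros_like(a) for _ in range(n_players)]
    for _ in range(iters):
        # Policy evaluation: one Lyapunov equation per player.
        a_cl = a - sum(b[j] @ k[j] for j in range(n_players))
        p_new = []
        for i in range(n_players):
            m = q[i] + sum(k[j].T @ r[i][j] @ k[j] for j in range(n_players))
            p_new.append(solve_lyapunov(a_cl, m))
        # Policy improvement: every player updates its own gain.
        k = [np.linalg.solve(r[i][i], b[i].T @ p_new[i]) for i in range(n_players)]
        change = max(np.max(np.abs(p_new[i] - p[i])) for i in range(n_players))
        p = p_new
        if change < tol:
            break
    return p, k


def coupled_riccati_residual(a, b, q, r, p):
    """Residual of each player's coupled Riccati equation at the given P_i."""
    n_players = len(b)
    k = [np.linalg.solve(r[i][i], b[i].T @ p[i]) for i in range(n_players)]
    a_cl = a - sum(b[j] @ k[j] for j in range(n_players))
    return [a_cl.T @ p[i] + p[i] @ a_cl + q[i]
            + sum(k[j].T @ r[i][j] @ k[j] for j in range(n_players))
            for i in range(n_players)]


# Illustrative symmetric scalar game (values chosen for this sketch only).
a = np.array([[-1.0]])
b = [np.array([[1.0]]), np.array([[1.0]])]
q = [np.array([[1.0]]), np.array([[1.0]])]
one, zero = np.array([[1.0]]), np.array([[0.0]])
r = [[one, zero], [zero, one]]  # no cross-weighting between players
k0 = [np.zeros((1, 1)), np.zeros((1, 1))]  # a = -1 is already stable

p_nash, k_nash = lq_nash_policy_iteration(a, b, q, r, k0)
print(p_nash[0][0, 0], p_nash[1][0, 0])
```

In this symmetric scalar case both coupled equations reduce to 3p^2 + 2p - 1 = 0, whose positive root is p = 1/3, so the printed values can be checked by hand. The online method described in the abstract instead tunes actor and critic networks from measured data along the system trajectories.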

Bibliographic details

Source: Automatica, 2011, Issue 8, 14 pages
Authors: Kyriakos G. Vamvoudakis; Frank L. Lewis
Original format: PDF
Language: English
Chinese Library Classification: TP1; TP2
Keywords: Multi-player games; Nash equilibrium; Coupled Hamilton-Jacobi equations; Coupled Riccati equations; Adaptive optimal control; Persistence of excitation

Similar literature

1. Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations [J]. Kyriakos G. Vamvoudakis, Frank L. Lewis. Automatica, 2011, Issue 8.
2. Integral reinforcement learning-based online adaptive event-triggered control for non-zero-sum games of partially unknown nonlinear systems [J]. Su Hanguang, Zhang Huaguang, Sun Shaoxin. Neurocomputing, 2020, Feb. 15 issue.
3. Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality [J]. Kyriakos G. Vamvoudakis, Frank L. Lewis, Greg R. Hudas. Automatica, 2012, Issue 8.
4. Non-zero sum games: Online learning solution of coupled Hamilton-Jacobi and coupled Riccati equations [C]. Vamvoudakis Kyriakos G., Lewis Frank L. 2011 IEEE International Symposium on Intelligent Control, 2011.
5. Applications of Hamilton-Jacobi equations to homogenization, optimal control and differential games [D]. Takei, Ryo. 2011.
6. Water Wave Solutions of the Coupled System Zakharov-Kuznetsov and Generalized Coupled KdV Equations [O]. A. R. Seadawy, K. El-Rashidy.
7. Online Adaptive Learning Solution of Multi-Agent Differential Graphical Games [O]. Kyriakos G., Frank L. 2012.
8. Max-Min Representations and Product Formulas for the Viscosity Solutions of Hamilton-Jacobi Equations with Applications to Differential Games [R]. Souganidis, P. E. 1985.
