Transactions of the Institute of Measurement and Control

Off-policy integral reinforcement learning algorithm in dealing with nonzero sum game for nonlinear distributed parameter systems



Abstract

Benefiting from integral reinforcement learning (IRL), the nonzero-sum (NZS) game for distributed parameter systems is effectively solved in this paper when the system dynamics are unavailable. The Karhunen-Loeve decomposition (KLD) is employed to convert the partial differential equation (PDE) system into a high-order ordinary differential equation (ODE) system. Moreover, off-policy IRL is introduced to design the optimal strategies for the NZS game. To confirm that the presented algorithm converges to the optimal value functions, the traditional adaptive dynamic programming (ADP) method is discussed first, and the equivalence between the traditional ADP method and the presented off-policy method is then proved. To implement the presented off-policy IRL method, actor and critic neural networks are utilized to approximate the control strategies and the value functions, respectively, during the iteration process. Finally, a numerical simulation is presented to illustrate the effectiveness of the proposed off-policy algorithm.
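As a rough illustration of the Karhunen-Loeve decomposition step mentioned in the abstract, the sketch below uses the standard snapshot/SVD formulation to extract a few dominant spatial modes from sampled PDE states; the time coefficients of these modes are what a finite ODE model would evolve. All variable and function names here are illustrative assumptions, not taken from the paper.

    import numpy as np

    def kl_decomposition(snapshots, n_modes):
        """Karhunen-Loeve (POD) decomposition of PDE solution snapshots.

        snapshots: array of shape (n_space, n_time), each column a sampled
                   spatial profile of the PDE state.
        Returns the first n_modes spatial basis functions, the time-varying
        modal coordinates, the temporal mean, and the captured energy, so
        that u(x, t) is approximated by sum_i a_i(t) * phi_i(x).
        """
        # Subtract the temporal mean so the modes capture fluctuations only.
        mean = snapshots.mean(axis=1, keepdims=True)
        fluct = snapshots - mean
        # SVD of the snapshot matrix yields the empirical eigenfunctions.
        U, s, Vt = np.linalg.svd(fluct, full_matrices=False)
        phi = U[:, :n_modes]          # spatial modes phi_i(x)
        a = phi.T @ fluct             # modal coordinates a_i(t_k)
        energy = s[:n_modes] ** 2 / np.sum(s ** 2)
        return phi, a, mean, energy

    # Example: snapshots of a 1-D PDE state on 100 grid points at 200 times.
    X = np.random.rand(100, 200)      # placeholder data for illustration
    phi, a, mean, energy = kl_decomposition(X, n_modes=3)
    print("captured energy per mode:", energy)

In the setting described by the abstract, projecting the PDE dynamics onto such modes would produce the high-order ODE system to which the off-policy IRL iteration and the actor-critic approximation are then applied.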
