Published in: International Conference on Neural Information Processing
Off-Policy Reinforcement Learning for Partially Unknown Nonzero-Sum Games

Abstract

In this paper, the optimal control problem of nonzero-sum (NZS) games with partially unknown dynamics is investigated. An off-policy reinforcement learning (RL) method is proposed to approximate the solution of the coupled Hamilton-Jacobi (HJ) equations. A single critic network structure is constructed for each player using neural network (NN) techniques. To improve the applicability of the off-policy RL method, tuning laws for the critic weights are designed for both offline learning and online learning. A simulation study demonstrates the effectiveness of the proposed algorithms.
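To make the critic-weight tuning concrete, the following is a minimal sketch of an offline batch update for per-player critic weights, under illustrative assumptions not taken from the paper: each player's value function is approximated as a linear combination of a hypothetical quadratic basis `phi`, and the weights are tuned by gradient descent on a squared Bellman-style residual evaluated on off-policy transition data. The paper's actual tuning law (derived from the coupled HJ equations) will differ in its residual and gain structure.

```python
import numpy as np

def phi(x):
    # Hypothetical quadratic basis for the critic: V_i(x) ~ w_i @ phi(x)
    return np.array([x[0]**2, x[0]*x[1], x[1]**2])

def offline_critic_update(W, data, lr=0.05):
    """One batch gradient step on the squared Bellman residual for each
    player's critic. `W` is a list of weight vectors (one per player);
    `data` holds off-policy transitions (x, per-player cost, x_next)."""
    for i, w in enumerate(W):
        grad = np.zeros_like(w)
        for x, cost, x_next in data:
            # Residual of an integral Bellman relation along the transition
            e = w @ phi(x_next) - w @ phi(x) + cost[i]
            grad += e * (phi(x_next) - phi(x))
        W[i] = w - lr * grad / len(data)
    return W
```

Because the data are collected by an arbitrary behavior policy and only replayed through the residual, the update is off-policy: the critic being tuned never has to generate the trajectories. The online variant in the paper would instead apply such a correction continuously as new state measurements arrive.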
