Published in: International Conference on Neural Information Processing
Off-Policy Reinforcement Learning for Partially Unknown Nonzero-Sum Games

Abstract

In this paper, the optimal control problem of nonzero-sum (NZS) games with partially unknown dynamics is investigated. An off-policy reinforcement learning (RL) method is proposed to approximate the solution of the coupled Hamilton-Jacobi (HJ) equations. A single critic network structure is constructed for each player using neural network (NN) techniques. To improve the applicability of the off-policy RL method, tuning laws for the critic weights are designed for both offline learning and online learning. A simulation study demonstrates the effectiveness of the proposed algorithms.
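To make the critic-weight tuning concrete, the following is a minimal sketch of an offline batch update for per-player critic weights, under illustrative assumptions not taken from the paper: each player's value function is approximated as a linear combination of a hypothetical quadratic basis `phi`, and the weights are tuned by gradient descent on a squared Bellman-style residual evaluated on off-policy transition data. The paper's actual tuning law (derived from the coupled HJ equations) will differ in its residual and gain structure.

```python
import numpy as np

def phi(x):
    # Hypothetical quadratic basis for the critic: V_i(x) ~ w_i @ phi(x)
    return np.array([x[0]**2, x[0]*x[1], x[1]**2])

def offline_critic_update(W, data, lr=0.05):
    """One batch gradient step on the squared Bellman residual for each
    player's critic. `W` is a list of weight vectors (one per player);
    `data` holds off-policy transitions (x, per-player cost, x_next)."""
    for i, w in enumerate(W):
        grad = np.zeros_like(w)
        for x, cost, x_next in data:
            # Residual of an integral Bellman relation along the transition
            e = w @ phi(x_next) - w @ phi(x) + cost[i]
            grad += e * (phi(x_next) - phi(x))
        W[i] = w - lr * grad / len(data)
    return W
```

Because the data are collected by an arbitrary behavior policy and only replayed through the residual, the update is off-policy: the critic being tuned never has to generate the trajectories. The online variant in the paper would instead apply such a correction continuously as new state measurements arrive.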
