Journal: Neurocomputing

Neural-network-based learning algorithms for cooperative games of discrete-time multi-player systems with control constraints via adaptive dynamic programming


Abstract

Adaptive dynamic programming (ADP), an important branch of reinforcement learning, is a powerful tool for solving a variety of optimal control problems. However, cooperative games of discrete-time multi-player systems with control constraints have rarely been investigated in this field. To address this gap, a novel policy iteration (PI) algorithm based on the ADP technique is proposed, and its convergence is analyzed in this brief paper. For the proposed PI algorithm, an online neural network (NN) implementation scheme with a multiple-network structure is presented. In the online NN-based learning algorithm, a critic network, constrained actor networks, and unconstrained actor networks are employed to approximate the value function, the constrained control policies, and the unconstrained control policies, respectively, and the NN weight updating laws are designed based on the gradient descent method. Finally, a numerical simulation example is presented to demonstrate the effectiveness of the proposed algorithm. (C) 2019 Elsevier B.V. All rights reserved.
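The abstract names the ingredients (a critic, bounded actor outputs, gradient-descent weight updates) but does not give the update laws. The following is only a minimal sketch of how such pieces might fit together, and it is not the paper's algorithm. Everything specific in it is an assumption made for illustration: the tanh saturation used to model the control constraint, the polynomial critic features, the scalar two-player dynamics with coefficients `a` and `b`, the shared quadratic utility, and all function names.

```python
import numpy as np

def bounded_control(raw, u_max):
    """Squash an unconstrained control into [-u_max, u_max].
    The tanh saturation is an assumed constraint model."""
    return u_max * np.tanh(raw / u_max)

def critic_features(x):
    """Polynomial features of a scalar state (hypothetical basis choice)."""
    return np.array([x**2, x**4])

def critic_update(w, x, x_next, utility, lr):
    """One gradient-descent step on 0.5 * e**2, where
    e = w^T phi(x) - (utility + w^T phi(x_next))
    and the target term is treated as fixed (semi-gradient)."""
    phi = critic_features(x)
    target = utility + w @ critic_features(x_next)
    e = w @ phi - target
    return w - lr * e * phi, e

def step(x, gains, u_max, a=0.9, b=(0.5, 0.3)):
    """Advance a hypothetical scalar two-player system one step.
    Each player applies a saturated linear feedback; all players
    share one utility, which is what makes the game cooperative."""
    u = np.array([bounded_control(-k * x, u_max) for k in gains])
    x_next = a * x + float(np.dot(b, u))
    utility = x**2 + float(u @ u)
    return x_next, utility, u

# Usage: evaluate fixed, bounded policies by updating the critic online.
w = np.zeros(2)
x = 1.0
for _ in range(50):
    x_next, utility, _ = step(x, gains=(0.4, 0.2), u_max=0.5)
    w, _ = critic_update(w, x, x_next, utility, lr=0.05)
    x = x_next
```

In this sketch the policies are held fixed, so the loop corresponds only to the policy-evaluation half of a policy iteration scheme; a policy-improvement step and separate actor networks would be needed to mirror the structure the abstract describes.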
