Conference: International Conference on Systems, Man, and Cybernetics

Model-free optimal consensus control for multi-agent systems using kernel-based ADP method

Abstract

Adaptive dynamic programming (ADP) is a prevalent way to solve the coupled Hamilton-Jacobi-Bellman (HJB) equations that arise in optimal consensus control of multi-agent systems (MAS). Neural networks (NNs) are normally used to approximate the value functions in ADP. However, NNs rely on manually designed features, which may limit their approximation ability. In this study, kernel-based methods, which do not require the structure of the value function model to be set in advance, are adopted for value function approximation. Moreover, to address the common situation in which the system dynamics are unknown, or the system is too complex for accurate dynamics to be obtained, local action value functions are defined and approximated with kernel-based methods. Thus, an action-dependent heuristic dynamic programming (ADHDP) approach using kernel-based approximation of the local action value functions is developed to achieve optimal consensus control in a model-free manner. The developed approach uses historical sample data to learn the system dynamics and avoids the traditional system identification scheme. Simulation results are provided to demonstrate the effectiveness of the presented approach.
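
The abstract gives no equations, so the following is only a rough sketch of the general idea it describes: approximating an action value function with a kernel expansion fitted to recorded transitions, without reading a dynamics model. Everything specific here is an assumption for illustration and is not taken from the paper: a single agent with a scalar local consensus error `e`, a Gaussian kernel, kernel ridge regression as the fitting step, a discount factor of 0.9, a fitted Q-iteration loop standing in for the ADHDP update, and a toy saturating plant used only to generate the sample data.

```python
import math

def rbf(z1, z2, width=0.5):
    """Gaussian kernel between two (error, action) pairs (assumed kernel)."""
    d2 = sum((a - b) ** 2 for a, b in zip(z1, z2))
    return math.exp(-d2 / (2.0 * width ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit_q(points, targets, ridge=1e-8):
    """Kernel ridge fit: Q(z) = sum_i alpha_i * k(z, z_i).
    No feature structure is chosen in advance; the samples define the model."""
    n = len(points)
    K = [[rbf(points[i], points[j]) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, targets)
    return lambda e, u: sum(a * rbf((e, u), z) for a, z in zip(alpha, points))

def collect_transitions(grid):
    """Recorded (error, action, cost, next_error) tuples from a toy plant.
    The plant rule only generates data; the learner never reads it."""
    data = []
    for e in grid:
        for u in grid:
            e_next = max(-1.0, min(1.0, e + u))  # hypothetical dynamics
            data.append((e, u, e * e + 0.1 * u * u, e_next))
    return data

def learn_q(data, actions, gamma=0.9, sweeps=20):
    """Fitted Q-iteration on stored samples (a stand-in for the ADHDP update)."""
    points = [(e, u) for e, u, _, _ in data]
    q = lambda e, u: 0.0
    for _ in range(sweeps):
        # Bellman target: observed cost plus discounted best next value.
        targets = [c + gamma * min(q(e_next, a) for a in actions)
                   for _, _, c, e_next in data]
        q = fit_q(points, targets)
    return q

GRID = [-1.0, -0.5, 0.0, 0.5, 1.0]
q = learn_q(collect_transitions(GRID), GRID)

def greedy(e):
    """Action from GRID with the lowest approximated action value."""
    return min(GRID, key=lambda u: q(e, u))
```

On this toy data, `greedy(1.0)` should select the action that cancels the error in a single step, and the same fitted `q` can also be queried at errors that are not in the sample set. In a multi-agent setting, each agent's local action value function would additionally depend on its neighbors' errors and actions, which this sketch leaves out.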
