Automatica

Off-policy learning for adaptive optimal output synchronization of heterogeneous multi-agent systems

Abstract

This paper proposes an off-policy learning-based dynamic state feedback protocol that achieves optimal output synchronization of heterogeneous multi-agent systems (MAS) over a directed communication network. Note that most recent works on heterogeneous MAS do not design the synchronization protocol in an optimal manner. By formulating the cooperative output regulation problem as an H-infinity optimization problem, reinforcement learning can be used to find output synchronization protocols online along the system trajectories, without solving the output regulator equations. In contrast to the existing literature on optimal synchronization, where the leader's states are assumed to be globally or distributively available over the communication network, we only allow relative system outputs to be transmitted through the network; that is, the leader's states are not needed for either control or learning. (C) 2020 Elsevier Ltd. All rights reserved.
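
For context, a cooperative output regulation problem of this kind is typically recast as a disturbance-attenuation problem before reinforcement learning is applied. The following is only a generic sketch of such an H-infinity formulation, with notation assumed here rather than taken from the paper: e_i denotes agent i's output tracking error relative to the leader, u_i its control input, w the exosystem signal treated as an external disturbance, gamma > 0 the attenuation level, and Q, R are weighting matrices.

    \min_{u_i} \; \max_{w} \; \int_{0}^{\infty} \left( e_i^{\top} Q\, e_i + u_i^{\top} R\, u_i - \gamma^{2} w^{\top} w \right) dt, \qquad Q \succeq 0, \; R \succ 0.

The off-policy element would then amount to evaluating and improving this min-max value function from trajectories generated under an arbitrary behavior policy, which is consistent with the abstract's claim that the protocols are learned online without explicitly solving the output regulator equations.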
