IEEE Symposium Series on Computational Intelligence

Off-policy reinforcement learning for distributed output synchronization of linear multi-agent systems

Abstract

In this paper, off-policy reinforcement learning (RL) is used to find a model-free optimal solution to the H∞ output synchronization problem for heterogeneous multi-agent discrete-time systems. First, the output synchronization problem is formulated as a set of local optimal tracking problems. It is shown that optimal local synchronization control protocols can be found by solving augmented game algebraic Riccati equations (GAREs). Solving the GAREs requires the leader state to be available to all agents, as well as knowledge of the agent dynamics. To obviate this requirement, a distributed adaptive observer is designed to estimate the leader state for all agents without requiring complete knowledge of the leader dynamics. Moreover, an off-policy RL algorithm is used to learn the solution to the GAREs using only measured data, without requiring knowledge of the agent or leader dynamics. In contrast to other model-free approaches, the disturbance input in the proposed approach does not need to be adjusted in a specific manner. A simulation example is given to show the effectiveness of the proposed method.
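The abstract refers to augmented game algebraic Riccati equations (GAREs) arising from an H∞ (zero-sum game) formulation with a disturbance input. As a rough, model-based illustration only, the Python/NumPy sketch below iterates a standard discrete-time game Riccati equation of the form P = Q + AᵀPA − AᵀP[B D] Θ⁻¹ [B D]ᵀPA. The matrices A, B, D, Q, R and the attenuation level gamma are hypothetical placeholders, not values from the paper, and the paper's contribution is learning this kind of solution off-policy from measured data without knowing (A, B, D); the model-based iteration is shown only to indicate what is being learned.

import numpy as np

# Hypothetical single-agent model used only for this illustration;
# the paper's off-policy RL algorithm learns the same solution from
# measured trajectories without knowing (A, B, D).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])   # agent (tracking-error) dynamics
B = np.array([[0.0],
              [1.0]])        # control input matrix
D = np.array([[0.1],
              [0.0]])        # disturbance input matrix
Q = np.eye(2)                # state weight
R = np.eye(1)                # control weight
gamma = 2.0                  # assumed H-infinity attenuation level

def gare_value_iteration(A, B, D, Q, R, gamma, iters=500, tol=1e-10):
    """Fixed-point iteration on the discrete-time game ARE
    P = Q + A'PA - A'P[B D] Theta^{-1} [B D]'PA, with
    Theta = [[R + B'PB, B'PD], [D'PB, D'PD - gamma^2 I]]."""
    n, m, q = A.shape[0], B.shape[1], D.shape[1]
    BD = np.hstack([B, D])
    P = np.zeros((n, n))
    for _ in range(iters):
        Theta = BD.T @ P @ BD
        Theta[:m, :m] += R
        Theta[m:, m:] -= gamma**2 * np.eye(q)
        P_next = Q + A.T @ P @ A - A.T @ P @ BD @ np.linalg.solve(Theta, BD.T @ P @ A)
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    # Recover the saddle-point gains: u = -K x (control),
    # w = -L x (worst-case disturbance).
    Theta = BD.T @ P @ BD
    Theta[:m, :m] += R
    Theta[m:, m:] -= gamma**2 * np.eye(q)
    gains = np.linalg.solve(Theta, BD.T @ P @ A)
    return P, gains[:m, :], gains[m:, :]

P, K, L = gare_value_iteration(A, B, D, Q, R, gamma)
print("P =\n", P)
print("K =", K, "\nL =", L)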
