American Control Conference

Toward Resilient Multi-Agent Actor-Critic Algorithms for Distributed Reinforcement Learning

Abstract

This paper considers a distributed reinforcement learning problem in the presence of Byzantine agents. The system consists of a central coordinating authority called the "master agent" and multiple computational entities called "worker agents". The master agent is assumed to be reliable, while a small fraction of the workers can be Byzantine (malicious) adversaries. The workers aim to cooperatively maximize a convex combination of the honest (non-malicious) worker agents' long-term returns through communication between the master agent and the worker agents. A distributed actor-critic algorithm is studied that makes use of an entry-wise trimmed mean. The algorithm's communication efficiency is improved by allowing each worker agent to send only a scalar-valued variable to the master agent at each iteration, instead of the entire parameter vector; the improved algorithm then computes a trimmed mean over only the received scalar-valued variables. Both algorithms are shown to converge almost surely.
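To make the aggregation step concrete, the following is a minimal Python sketch of an entry-wise (coordinate-wise) trimmed mean, the robust aggregation rule the abstract attributes to the master agent. The function name, the trimming fraction `beta`, the NumPy-based implementation, and the toy numbers are illustrative assumptions, not the authors' code; a one-dimensional input corresponds to the scalar-valued, communication-efficient variant.

```python
import numpy as np

def trimmed_mean(updates: np.ndarray, beta: float) -> np.ndarray:
    """Coordinate-wise trimmed mean over worker-submitted values.

    updates: array of shape (num_workers, dim); each row is one worker's
             submitted vector (dim == 1 covers the scalar-valued variant).
    beta:    fraction of extreme values removed from each end of every
             coordinate; it should exceed the assumed Byzantine fraction.
    """
    num_workers = updates.shape[0]
    k = int(np.floor(beta * num_workers))  # entries trimmed from each tail
    # Sort each coordinate independently across workers, drop the k smallest
    # and k largest entries, and average what remains.
    sorted_updates = np.sort(updates, axis=0)
    kept = sorted_updates[k:num_workers - k, :]
    return kept.mean(axis=0)

# Toy example (hypothetical numbers): 10 workers with 4-dimensional updates,
# 2 of them Byzantine and sending arbitrarily large values. The trimmed mean
# stays close to the honest workers' average despite the outliers.
rng = np.random.default_rng(0)
honest = rng.normal(loc=1.0, scale=0.1, size=(8, 4))
byzantine = np.full((2, 4), 1e6)
aggregated = trimmed_mean(np.vstack([honest, byzantine]), beta=0.25)
print(aggregated)
```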
