Conference: Australasian Joint Conference on Artificial Intelligence

Train Small, Deploy Big: Do Relative World Views Permit Swarm-Safety During Policy Transplantation for Multi-Agent Reinforcement Learning Problems?


Abstract

To 'train small, deploy big', agent control policies must be transplanted from one trained agent into a larger set of agents for deployment. Because compute resources and training time generally scale with the number of agents, this approach to generating swarm control policies may be favourable for larger swarms. For the process to succeed, however, the agent control policy must be indistinct to the agent on which it was trained, so that it performs as required in its new host agent. Through extensive simulation of a cooperative multi-agent navigation task, it is shown that this indistinctness of agent policies, and therefore the success of the learned solution in the transplanted swarm, depends on the way in which an agent views the world: absolute or relative. As a corollary to this result, and contrary to naive intuition, we show that homogeneous agent capability is not enough to guarantee policy indistinctness. The article also discusses what general conditions may be required to enforce policy indistinctness.
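The abstract does not say how the absolute and relative world views are encoded. As a purely illustrative sketch, assuming 2D integer positions and hypothetical helper names (`absolute_observation`, `relative_observation`, `shift`) that do not come from the paper, the Python below contrasts the two. It shows one property that may help a policy stay indistinct to its host: the relative view is unchanged when the whole scene is translated, while the absolute view is not.

```python
# Illustrative only: the observation layouts below are assumptions,
# not the encoding used in the paper.

def absolute_observation(agent, others, goal):
    """World-frame view: raw coordinates of self, goal, and other agents."""
    obs = [agent[0], agent[1], goal[0], goal[1]]
    for other in others:
        obs.extend([other[0], other[1]])
    return obs


def relative_observation(agent, others, goal):
    """Agent-centred view: offsets of goal and other agents from the observer."""
    obs = [goal[0] - agent[0], goal[1] - agent[1]]
    for other in others:
        obs.extend([other[0] - agent[0], other[1] - agent[1]])
    return obs


def shift(point, dx, dy):
    """Translate a 2D point by (dx, dy)."""
    return (point[0] + dx, point[1] + dy)


# A small scene, and the same scene translated by (5, -3).
agent, others, goal = (1, 2), [(4, 6)], (0, 0)
agent_s = shift(agent, 5, -3)
others_s = [shift(o, 5, -3) for o in others]
goal_s = shift(goal, 5, -3)

rel_before = relative_observation(agent, others, goal)
rel_after = relative_observation(agent_s, others_s, goal_s)
abs_before = absolute_observation(agent, others, goal)
abs_after = absolute_observation(agent_s, others_s, goal_s)

print(rel_before == rel_after)  # relative view ignores the translation
print(abs_before == abs_after)  # absolute view changes with it
```

Under these assumptions, a policy fed the relative view receives the same input wherever its host agent happens to sit in the world. This is only one possible ingredient of indistinctness, not the conditions the article itself derives.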
