Conference: IEEE International Inter-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support

A collaborative distributed multi-agent reinforcement learning technique for dynamic agent shortest path planning via selected sub-goals in complex cluttered environments



Abstract

Collaborative monitoring of large infrastructures, such as military, transportation, and maritime systems, is a decisive issue in many surveillance, protection, and security applications. In many of these applications, autonomous path planning for dynamic multi-agent systems using reinforcement learning, where agents may move randomly to reach their respective goals while intelligently avoiding topographical obstacles, becomes a challenging problem, especially in a dynamic agent environment. In our prior work we presented an intelligent multi-agent hybrid reactive and reinforcement learning technique for collaborative autonomous agent path planning for monitoring Critical Key Infrastructures and Resources (CKIR) in geographically and computationally distributed systems. There, agent monitoring of large environments is reduced to monitoring of relatively smaller, trackable, geographically distributed agent environment regions. In this paper we tackle this problem in the challenging case of complex and cluttered environments, where agents' initial random-walk paths become difficult and relatively non-converging. We propose a multi-agent distributed hybrid reactive reinforcement learning technique based on selected intermediary agent sub-goals, using a learning reward scheme in a distributed-computing memory setting. Various case-study scenarios are presented in a convergence study toward the shortest, minimum-time exploratory paths for faster and more efficient agent learning. In this work the distributed dynamic agent communication is implemented via the Message Passing Interface (MPI).
