International Conference on Automated Planning and Scheduling

Reinforcement Learning for Zone Based Multiagent Pathfinding under Uncertainty

Abstract

We address the problem of multiple agents finding paths from their respective source nodes to destination nodes in a graph, known as multiagent pathfinding (MAPF). Most existing approaches assume that all agents move at a fixed speed and that a single node accommodates only a single agent. Motivated by emerging applications of autonomous vehicles such as drone traffic management, we present zone-based path finding (ZBPF), in which agents move among zones and each movement takes an uncertain amount of travel time. Furthermore, each zone can accommodate multiple agents, up to its capacity. We also develop a simulator for ZBPF that provides a clean interface between the simulation environment and learning algorithms. We develop a novel formulation of the ZBPF problem using difference-of-convex functions (DC) programming. The resulting approach can be used for policy learning from samples generated by the simulator. We also present a multiagent credit assignment scheme that helps our learning approach converge faster. Empirical results on a number of 2D and 3D instances show that our approach can effectively minimize congestion in zones while ensuring that agents reach their final destinations.
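To make the problem setting concrete, the following is a minimal Python sketch of a zone-based environment with zone capacities and random travel times. It is not the authors' simulator or formulation: the class name `ZoneWorld`, its methods, the uniform integer travel-time model, and the congestion measure (number of agents above capacity, summed over zones) are all assumptions chosen for illustration only.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ZoneWorld:
    """Hypothetical toy zone graph; not the simulator described in the abstract."""
    neighbors: dict   # zone -> list of adjacent zones
    capacity: dict    # zone -> number of agents the zone holds without congestion
    t_min: int = 1    # assumed minimum number of steps spent in a zone after entering
    t_max: int = 3    # assumed maximum number of steps spent in a zone after entering
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def reset(self, starts):
        # zone[i] is agent i's current zone; remaining[i] > 0 means it cannot move yet
        self.zone = list(starts)
        self.remaining = [0] * len(starts)
        return list(self.zone)

    def step(self, actions):
        """actions[i] is the zone agent i wants to enter next, or None to stay."""
        for i, nxt in enumerate(actions):
            if self.remaining[i] > 0:
                # Agent is still traversing its zone; the requested action is ignored
                self.remaining[i] -= 1
            elif nxt is not None and nxt in self.neighbors[self.zone[i]]:
                self.zone[i] = nxt
                # Uncertain travel time: sampled uniformly from [t_min, t_max]
                self.remaining[i] = self.rng.randint(self.t_min, self.t_max)
        return list(self.zone), self.congestion()

    def congestion(self):
        # Total number of agents exceeding capacity, summed over all zones
        counts = {}
        for z in self.zone:
            counts[z] = counts.get(z, 0) + 1
        return sum(max(0, n - self.capacity[z]) for z, n in counts.items())


# Usage: two agents on a three-zone line graph A - B - C
world = ZoneWorld(
    neighbors={"A": ["B"], "B": ["A", "C"], "C": ["B"]},
    capacity={"A": 2, "B": 1, "C": 2},
)
world.reset(["A", "A"])
zones, cong = world.step(["B", "B"])
print(zones, cong)
```

In this sketch, sending both agents into zone B at once exceeds its capacity of 1, so the congestion measure becomes positive; a learned policy in such a setting would aim to route or delay agents so that this quantity stays low while every agent still reaches its destination.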