Reinforcement Learning for Zone Based Multiagent Pathfinding under Uncertainty

Abstract

We address the problem of multiple agents finding their paths from respective sources to destination nodes in a graph (also called MAPF). Most existing approaches assume that all agents move at fixed speed, and that a single node accommodates only a single agent. Motivated by the emerging applications of autonomous vehicles such as drone traffic management, we present zone-based path finding (or ZBPF) where agents move among zones, and agents' movements require uncertain travel time. Furthermore, each zone can accommodate multiple agents (as per its capacity). We also develop a simulator for ZBPF which provides a clean interface from the simulation environment to learning algorithms. We develop a novel formulation of the ZBPF problem using difference-of-convex functions (DC) programming. The resulting approach can be used for policy learning using samples from the simulator. We also present a multiagent credit assignment scheme that helps our learning approach converge faster. Empirical results in a number of 2D and 3D instances show that our approach can effectively minimize congestion in zones, while ensuring agents reach their final destinations.

Bibliographic details

Source
International Conference on Automated Planning and Scheduling | 2020 | 598p | 9 pages
Authors
Jiajing Ling; Tarun Gupta; Akshat Kumar
Original format: PDF
Chinese Library Classification: TP2-53

Similar literature

1. Reinforcement signal communication based multiagent reinforcement learning [J]. Tomohiro Yamaguchi. 電子情報通信学会技術研究報告. オフィスシステム. 2000, No. 197
2. Reinforcement Learning for Zone Based Multiagent Pathfinding under Uncertainty [C]. Jiajing Ling, Tarun Gupta, Akshat Kumar. International Conference on Automated Planning and Scheduling. 2020
3. Explaining Collective Behavior with Dynamical Systems: Spatial Gradient Sensing in Eukaryotic Chemotaxis and Learning Dynamics in Multiagent Reinforcement Learning [D]. Shams, Daniel. 2019
4. Multiagent cooperation and competition with deep reinforcement learning [O]. Ardi Tampuu, Tambet Matiisen, Dorian Kodelja, et al.
5. Best Response Bayesian Reinforcement Learning for Multiagent Systems with State Uncertainty [O]. Oliehoek FA, Amato C. 2014
