Journal: Quality Control, Transactions

Collision Avoidance in Pedestrian-Rich Environments With Deep Reinforcement Learning


Abstract

Collision avoidance algorithms are essential for safe and efficient robot operation among pedestrians. This work proposes using deep reinforcement learning (RL) as a framework to model the complex interactions and cooperation with nearby decision-making agents, such as pedestrians and other robots. Existing RL-based works assume homogeneity of agent properties, use specific motion models over short timescales, or lack a principled method to handle a large, possibly varying number of agents. Therefore, this work develops an algorithm that learns collision avoidance among a variety of heterogeneous, non-communicating, dynamic agents without assuming they follow any particular behavior rules. It extends our previous work by introducing a strategy using Long Short-Term Memory (LSTM) that enables the algorithm to use observations of an arbitrary number of other agents, instead of a small, fixed number of neighbors. The proposed algorithm is shown to outperform both a classical collision avoidance algorithm and another deep RL-based algorithm, and it scales better with the number of agents (fewer collisions, shorter time to goal) than our previously published learning-based approach. Analysis of the LSTM provides insights into how observations of nearby agents affect the hidden state and quantifies the performance impact of various agent ordering heuristics. The learned policy generalizes to several applications beyond the training scenarios: formation control (arrangement into letters), demonstrations on a fleet of four multirotors, and demonstrations on a fully autonomous robotic vehicle capable of traveling at human walking speed among pedestrians.
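
The idea of using an LSTM to accept observations of an arbitrary number of other agents can be illustrated with a minimal sketch. This is not the authors' implementation: the class name TinyLSTMEncoder, the function order_farthest_first, the four-element observation layout, and the random untrained weights are all assumptions made for illustration. The farthest-first ordering is shown as one possible agent ordering heuristic; the abstract does not state which heuristics were compared.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyLSTMEncoder:
    """Illustrative LSTM cell with random, untrained weights (hypothetical)."""

    def __init__(self, obs_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        # One stacked weight matrix for the input, forget, candidate and output gates.
        self.W = rng.normal(scale=0.1, size=(4 * hidden_dim, obs_dim + hidden_dim))
        self.b = np.zeros(4 * hidden_dim)
        self.hidden_dim = hidden_dim

    def step(self, x, h, c):
        # Standard LSTM update for a single input vector x.
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        return h, c

    def encode(self, agent_obs):
        """Fold a (num_agents, obs_dim) array into one fixed-size vector."""
        h = np.zeros(self.hidden_dim)
        c = np.zeros(self.hidden_dim)
        for x in agent_obs:  # one LSTM step per observed agent
            h, c = self.step(x, h, c)
        return h


def order_farthest_first(agent_obs):
    # Assumed layout: columns 0 and 1 hold the other agent's relative x, y position.
    # Feeding the closest agent last gives it the most recent effect on the hidden state.
    dist = np.linalg.norm(agent_obs[:, :2], axis=1)
    return agent_obs[np.argsort(-dist)]


if __name__ == "__main__":
    encoder = TinyLSTMEncoder(obs_dim=4, hidden_dim=8)
    rng = np.random.default_rng(1)
    for n in (1, 3, 10):
        obs = rng.normal(size=(n, 4))
        print(n, encoder.encode(order_farthest_first(obs)).shape)
```

Whatever the number of observed agents, the encoder returns a vector of the same length, so a downstream policy network with a fixed input size could consume it. In a trained system the weights would be learned jointly with the policy rather than sampled at random.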
