Conference: IEEE International Conference on Robotics and Automation

Analyzing the Suitability of Cost Functions for Explaining and Imitating Human Driving Behavior based on Inverse Reinforcement Learning

Abstract

Autonomous vehicles are sharing the road with human drivers. To facilitate interactive driving and cooperative behavior in dense traffic, a thorough understanding and representation of other traffic participants' behavior is necessary. Cost functions (or reward functions) have been widely used to describe the behavior of human drivers because they can explicitly incorporate the rationality of human drivers and the theory of mind (ToM), and because they share similarities with the motion planning problem of autonomous vehicles. Hence, more human-like driving behavior and comprehensible trajectories can be generated to enable safer interaction and cooperation. However, selecting cost functions for different driving scenarios is not trivial, and there is no systematic summary and analysis of cost function selection and learning across a variety of driving scenarios. In this work, we investigate to what extent cost functions are suitable for explaining and imitating human driving behavior. Further, we focus on how cost functions differ from each other in different driving scenarios. Toward this goal, we first comprehensively review existing cost function structures in the literature. Based on that, we point out the conditions that demonstrations must satisfy to be suitable for inverse reinforcement learning (IRL). Finally, we use IRL to explore suitable features and learn cost function weights from human-driven trajectories in three different scenarios.
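To make the idea of learning cost function weights from demonstrations concrete, here is a minimal sketch. It is not the paper's method: the abstract does not say which IRL variant or features are used. The sketch assumes a linear cost (a weighted sum of trajectory features), a maximum-entropy style model over a small, fixed set of candidate trajectories, and two hypothetical features (mean squared acceleration and mean squared deviation from a desired speed). All function names and numbers are illustrative.

```python
import math


def cost(weights, features):
    """Linear cost: weighted sum of a trajectory's feature values."""
    return sum(w * f for w, f in zip(weights, features))


def trajectory_probabilities(weights, candidate_features):
    """Assumed max-entropy model: P(trajectory) is proportional to exp(-cost)."""
    costs = [cost(weights, f) for f in candidate_features]
    c_min = min(costs)  # shift by the minimum cost for numerical stability
    unnorm = [math.exp(-(c - c_min)) for c in costs]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def learn_weights(demo_features, candidate_features, lr=0.1, steps=500):
    """Gradient descent on the negative log-likelihood of the demonstration.

    The gradient with respect to the weights is the demonstration's feature
    vector minus the expected feature vector under the current model.
    """
    weights = [0.0] * len(demo_features)
    for _ in range(steps):
        probs = trajectory_probabilities(weights, candidate_features)
        expected = [
            sum(p * f[k] for p, f in zip(probs, candidate_features))
            for k in range(len(weights))
        ]
        weights = [
            w - lr * (fd - fe)
            for w, fd, fe in zip(weights, demo_features, expected)
        ]
    return weights


# Hypothetical feature vectors: [mean squared acceleration,
# mean squared deviation from desired speed]. Values are made up.
candidates = [[0.2, 0.3], [1.0, 0.5], [0.6, 0.9]]
demo = candidates[0]  # assume the human chose the first candidate

weights = learn_weights(demo, candidates)
probs = trajectory_probabilities(weights, candidates)
print("learned weights:", weights)
print("trajectory probabilities:", probs)
```

In this toy setup the demonstrated trajectory has the lowest value of both features, so both learned weights come out positive and the demonstration receives the highest probability. With real driving data, the candidate set, the features, and any constraints on the weights would have to be chosen per scenario, which is the question the abstract raises.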