Journal: Transportation research

Learning how to dynamically route autonomous vehicles on shared roads

Abstract

Road congestion induces significant costs across the world, and road network disturbances, such as traffic accidents, can cause highly congested traffic patterns. If a planner had control over the routing of all vehicles in the network, they could easily reverse this effect. In a more realistic scenario, we consider a planner that controls autonomous cars, which are a fraction of all present cars. We study a dynamic routing game, in which the route choices of autonomous cars can be controlled and the human drivers react selfishly and dynamically. As the problem is prohibitively large, we use deep reinforcement learning to learn a policy for controlling the autonomous vehicles. This policy indirectly influences human drivers to route themselves in such a way that minimizes congestion on the network. To gauge the effectiveness of our learned policies, we establish theoretical results characterizing equilibria and empirically compare the learned policy results with best possible equilibria. We prove properties of equilibria on parallel roads and provide a polynomial-time optimization for computing the most efficient equilibrium. Moreover, we show that in the absence of these policies, high demand and network perturbations would result in large congestion, whereas using the policy greatly decreases the travel times by minimizing the congestion. To the best of our knowledge, this is the first work that employs deep reinforcement learning to reduce congestion by indirectly influencing humans' routing decisions in mixed-autonomy traffic.
