Cordon control with spatially-varying metering rates: A Reinforcement Learning approach

Abstract

This work explores how Reinforcement Learning (RL) can be used to re-time traffic signals around cordoned neighborhoods. An RL-based controller is developed by representing traffic states as graph-structured data and customizing corresponding neural network architectures to handle those data. The customizations enable the controller to: (i) model neighborhood-wide traffic based on directed-graph representations; (ii) use the representations to identify patterns in real-time traffic measurements; and (iii) capture those patterns in a spatial representation needed for selecting optimal cordon-metering rates. Input to the selection process also includes the total inflow rate to be admitted through a cordon. That rate is optimized in a separate process that is not part of the present work. Our RL-controller distributes that separately-optimized rate across the signalized street links that feed traffic through the cordon. The resulting metering rates vary from one feeder link to the next. The selection process can reoccur at short time intervals in response to changing traffic patterns. Once trained on a few cordons, the RL-controller can be deployed on cordons elsewhere in a city without additional training. This portability feature is confirmed via simulations of traffic on an idealized street network. The tests also indicate that the controller can reduce the network's vehicle hours traveled (VHT) well beyond what can be achieved via spatially-uniform cordon metering. The extra reductions in VHT are found to grow larger when traffic exhibits greater inhomogeneities over the network.
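The distribution step described in the abstract, splitting one total inflow rate across the feeder links of a cordon, can be illustrated with a minimal sketch. This is not the authors' controller: the abstract does not say how per-link shares are computed, so the softmax weighting, the function names `uniform_rates` and `distribute_inflow`, the per-link scores, and the 1200 veh/h total are all assumptions chosen for illustration. In the paper, a trained graph-based RL policy would play the role of the hypothetical scores.

```python
import math

def uniform_rates(total_inflow, n_links):
    """Spatially-uniform baseline: every feeder link gets the same share."""
    return [total_inflow / n_links] * n_links

def distribute_inflow(total_inflow, link_scores):
    """Split total_inflow across feeder links in proportion to softmax(link_scores).

    link_scores are hypothetical per-link preferences (assumption); higher
    scores receive a larger share. The shares always sum to total_inflow.
    """
    m = max(link_scores)                      # subtract max for numerical stability
    weights = [math.exp(s - m) for s in link_scores]
    z = sum(weights)
    return [total_inflow * w / z for w in weights]

# Hypothetical example: four feeder links, 1200 veh/h admitted in total.
scores = [0.2, 1.0, -0.5, 0.2]
rates = distribute_inflow(1200.0, scores)
baseline = uniform_rates(1200.0, len(scores))
```

Both allocations admit the same total inflow; only the split across feeder links differs, which is the distinction the abstract draws between spatially-varying and spatially-uniform metering.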

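Bibliographic record

  • Source
    Transportation research | 2019, Issue 1 | pp. 358-369 | 12 pages
  • Authors

    Ni Wei; Cassidy Michael J.

  • Author affiliation

    Univ Calif Berkeley, Dept Civil & Environm Engn, Berkeley, CA 94720 USA

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Text language: English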
