Transportation Research Board Annual Meeting

Closed-Loop Optimal Freeway Ramp Metering using Continuous State Space Reinforcement Learning with Function Approximation


Abstract

In recent years, Reinforcement Learning (RL), an Artificial Intelligence based learning method, has gained interest among researchers for solving control systems problems. Although RL methods have been applied to various transportation problems such as ramp metering and traffic signal control, RL in its conventional form, with a discrete state space representation, lacks learning efficiency and becomes intractable when applied to medium- and large-scale transportation control problems. Continuous state space representation in RL problems implies direct representation of the problem's continuous variables using function approximation techniques, a representation that has the potential to address some of the challenges of employing RL in large transportation networks. Function approximation methods, when properly designed, can yield 1) faster learning, 2) better performance, and 3) easier design and setup of RL control systems. In this paper, three function approximation techniques, k-nearest neighbor weighted average, multi-layer perceptron neural network, and linear model tree, are developed and compared against conventional table-based RL as a benchmark. The four approaches are applied to a ramp metering case study in the city of Toronto. The approaches are tested on a microsimulation model and compared using the following criteria: learning speed, design effort, computational requirements, and network performance. It is concluded that, for RL problems, the linear model tree method provides the best function approximation with minimal design effort given the noisy measurements in traffic control applications, learning more than 10 times faster than conventional table-based RL methods.
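To make the continuous-state idea concrete, the sketch below shows one of the compared techniques, a k-nearest-neighbor weighted-average Q-function approximator, in minimal Python. The class name, parameters, and update details are illustrative assumptions for this page, not the paper's implementation: Q(s, a) for a continuous state is estimated as an inverse-distance weighted average over stored samples, and each temporal-difference update is spread across the neighborhood.

```python
import numpy as np

class KNNQApproximator:
    """Minimal k-nearest-neighbor weighted-average Q-function approximator
    for continuous states (illustrative sketch, not the paper's code)."""

    def __init__(self, n_actions, k=5, alpha=0.1, gamma=0.95):
        self.n_actions = n_actions  # size of the discrete action set
        self.k = k                  # neighbors included in the average
        self.alpha = alpha          # learning rate
        self.gamma = gamma          # discount factor
        self.states = []            # stored continuous sample states
        self.q = []                 # per-sample Q-values, one row per state

    def _neighbors(self, state):
        """Indices and distances of the k nearest stored states."""
        d = np.linalg.norm(np.asarray(self.states) - np.asarray(state), axis=1)
        idx = np.argsort(d)[: self.k]
        return idx, d[idx]

    def value(self, state, action):
        """Q(state, action) as an inverse-distance weighted average."""
        if not self.states:
            return 0.0
        idx, d = self._neighbors(state)
        w = 1.0 / (d + 1e-6)  # closer samples weigh more
        qs = np.array([self.q[i][action] for i in idx])
        return float(np.dot(w, qs) / w.sum())

    def update(self, state, action, reward, next_state):
        """One Q-learning step: spread the TD update over the neighbors."""
        target = reward + self.gamma * max(
            self.value(next_state, a) for a in range(self.n_actions)
        )
        # Store the visited state, then nudge its neighborhood toward target.
        self.states.append(np.asarray(state, dtype=float))
        self.q.append(np.zeros(self.n_actions))
        idx, _ = self._neighbors(state)
        for i in idx:
            self.q[i][action] += self.alpha * (target - self.q[i][action])
```

In a ramp metering setting, the state might be a vector such as (mainline occupancy, ramp queue length) and the actions a set of metering rates; a real controller would also cap the stored sample set, which this sketch omits for brevity.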
