Closed-Loop Optimal Freeway Ramp Metering using Continuous State Space Reinforcement Learning with Function Approximation

Abstract

In recent years, Reinforcement Learning (RL), an Artificial Intelligence based learning method, has gained interest among researchers for solving control system problems. Although RL methods have been applied to transportation problems such as ramp metering and traffic signal control, RL in its conventional form, with a discrete state space representation, lacks learning efficiency and becomes intractable when applied to medium- and large-scale transportation control problems. Continuous state space representation in RL implies direct representation of the problem's continuous variables using function approximation techniques, which has the potential to address some of the challenges associated with employing RL in large transportation networks. Function approximation methods, when properly designed, have the potential to result in 1) faster learning, 2) better performance, and 3) easier design and setup of RL control systems. In this paper, three function approximation techniques (k-nearest neighbor weighted average, multi-layer perceptron neural network, and linear model tree) are developed and compared against conventional table-based RL as a benchmark. The four approaches are applied to a ramp metering case study in the city of Toronto. The approaches are tested on a microsimulation model and compared using the following criteria: learning speed, design effort, computational requirements, and network performance. It is concluded that, for RL problems, the linear model tree method provides the best function approximation with minimal design effort given the noisy measurements in traffic control applications, with a learning speed more than 10 times faster than that of conventional table-based RL methods.
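
The k-nearest neighbor weighted average idea named in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the class name `KNNQApproximator`, the inverse-squared-distance weighting, the Q-learning style update, and the parameters `k`, `alpha`, and `gamma` are all assumptions chosen for illustration. The sketch stores a set of sample points in a continuous state space, estimates an action value as the weighted average of the values at the k nearest points, and spreads each learning update over those neighbors.

```python
import math


class KNNQApproximator:
    """Action values over a continuous state via k-NN weighted averaging (illustrative)."""

    def __init__(self, n_actions, k=3, alpha=0.5):
        self.n_actions = n_actions
        self.k = k            # number of neighbors averaged (assumed value)
        self.alpha = alpha    # learning rate (assumed value)
        self.points = []      # stored continuous states, as tuples of floats
        self.q = []           # one list of action values per stored state

    def add_point(self, state):
        """Register a sample point with all action values initialized to zero."""
        self.points.append(tuple(state))
        self.q.append([0.0] * self.n_actions)

    def _neighbors(self, state):
        """Return (index, normalized weight) pairs for the k nearest stored points."""
        nearest = sorted(
            (math.dist(state, p), i) for i, p in enumerate(self.points)
        )[: self.k]
        # Inverse-squared-distance weighting is an assumption of this sketch.
        raw = [(i, 1.0 / (1.0 + d * d)) for d, i in nearest]
        total = sum(w for _, w in raw)
        return [(i, w / total) for i, w in raw]

    def value(self, state, action):
        """Weighted average of the neighbors' values for the given action."""
        return sum(w * self.q[i][action] for i, w in self._neighbors(state))

    def update(self, state, action, reward, next_state, gamma=0.9):
        """One Q-learning style step, shared among neighbors by weight."""
        best_next = max(self.value(next_state, a) for a in range(self.n_actions))
        error = reward + gamma * best_next - self.value(state, action)
        for i, w in self._neighbors(state):
            self.q[i][action] += self.alpha * w * error
```

In a ramp metering setting, a state might hold continuous measurements such as mainline density and ramp queue length, and each action might be a candidate metering rate; those choices are hypothetical here. Unlike a lookup table, nearby states share what is learned, which is the kind of generalization the abstract credits for faster learning.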

Bibliographic details

Source
Annual Meeting of the Transportation Research Board; Transportation Research Board | 2014 | pp. 1-16 | 16 pages
Authors
Kasra Rezaee; Baher Abdulhai; Hossam Abdelgawad
Original format
PDF
