MASTraf: A decentralized multi-agent system for network-wide traffic signal control with dynamic coordination

Abstract

Continuous increases in traffic volume and limited available capacity in the roadway system have created a need for improved traffic control. From traditional pre-timed isolated signals to actuated and coordinated corridors, traffic control for urban networks has evolved into more complex adaptive signal control systems. However, unexpected traffic fluctuations, rapid changes in traffic demand, oversaturation, incidents, and adverse weather conditions, among others, significantly impact traffic network operation in ways that current control systems cannot always cope with.

On the other hand, traffic control strategies based on developments in machine learning can provide promising alternative solutions, particularly those that make use of unsupervised learning such as reinforcement learning (RL), also referred to as approximate dynamic programming (ADP) in some research communities. For the traffic control problem, examples of suitable RL algorithms are off-policy Q-learning and ADP using a post-decision state variable, since they address processes with sequential decision making, do not need to compute transition probabilities, and are well suited for high-dimensional spaces.

A series of benefits is expected from these algorithms in the traffic control domain: 1) no need for prediction models to transition traffic over time and estimate the best actions; 2) availability of cost-to-go estimates at any time (appropriate for real-time applications); 3) self-evolving policies; and 4) flexibility to make use of new sources of information that are part of emergent Intelligent Transportation Systems (ITS), such as mobile vehicle detectors (Bluetooth and GPS vehicle locators).

Given the potential benefits of these strategies, this research proposes MASTraf: a decentralized Multi-Agent System for network-wide Traffic signal control with dynamic coordination. MASTraf is designed to capture the behavior of the environment and make decisions based on situations directly observed by RL agents. Agents can also communicate with each other, exploring the effects of temporary coalitions or subgroups of intersections as a mechanism for coordination.

Separate MASTraf implementations with similar state and reward functions, one using Q-learning and one using ADP, were tested in a microscopic traffic simulator (VISSIM) with real-time manipulation of the traffic signals through the software's COM interface. Testing was conducted to determine the performance of the agents in scenarios of increasing complexity, from a single intersection to arterials and networks, in both undersaturated and oversaturated conditions.

Results show that the multi-agent system provided by MASTraf improves its performance as the agents accumulate experience, and that the system was able to efficiently manage the traffic signals in both simple and complex scenarios. Exploration of the policies generated by MASTraf showed that the agents followed expected behavior by providing green to greater vehicle demands and accounting for the effects of blockages and lost time. The performance of MASTraf was on par with current state-of-practice tools for finding signal control settings, but MASTraf can also adapt to changes in demand and driver behavior by adjusting the signal timings in real time, thus improving coordination and preventing queue spillbacks and green starvation.

A strategy for signal coordination was also tested in one of the MASTraf implementations, showing increased throughput and a reduced number of stops, as expected. The coordination employed a version of the max-plus algorithm embedded in the reward structure, acting as a bias towards improved coordination. The response of the system to imprecise detector data, in the form of coarse aggregation, showed that the system was able to handle oversaturation under such conditions. Even when the data had only 25% of the resolution of the original implementation, system throughput was reduced by only 5% and the number of stops per vehicle increased by 8%.

The state and reward formulations allowed for a simple function approximation method, both to reduce the memory required to store the state space and to create a form of generalization for states that have not been visited or have not been experienced enough. Given the discontinuities in the reward function generated by penalties for blockages and lost time, the value approximation was conducted through a series of functions, one for each action and for each of the conditions before and after a discontinuity. The policies generated using MASTraf with function approximation were analyzed for different intersections in the network, showing agent behavior that reflected the principles formulated in the original problem using lookup tables, including right-of-way assignment based on expected rewards with consideration of penalties such as lost time. In terms of system performance, MASTraf with function approximation resulted in an average reduction of 1% in total system throughput and an average increase of 3.6% in the number of stops per vehicle, compared with the implementation using lookup tables on a congested network of 20 intersections.
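
As a rough illustration of the off-policy Q-learning update mentioned in the abstract, the following Python sketch trains a lookup-table agent on a toy intersection with two competing approaches. It is not the MASTraf implementation: the state (capped queue lengths), the reward (negative total queue), the arrival and discharge rates, and the names `q_update`, `step`, and `train` are all assumptions made for this example, and the toy environment merely stands in for a simulator such as VISSIM.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # hypothetical: 0 = green for north-south, 1 = green for east-west
MAX_QUEUE = 10    # queues are capped so the lookup table stays small


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def step(state, action, rng):
    """Toy dynamics (assumed): the served approach discharges up to 3 vehicles,
    then each approach gains one vehicle with probability 0.5."""
    queues = list(state)
    queues[action] = max(0, queues[action] - 3)
    for i in (0, 1):
        if rng.random() < 0.5:
            queues[i] = min(MAX_QUEUE, queues[i] + 1)
    next_state = tuple(queues)
    reward = -sum(next_state)  # assumed reward: fewer queued vehicles is better
    return next_state, reward


def train(steps=20000, epsilon=0.1, seed=0):
    """Epsilon-greedy training loop that fills a lookup table of Q-values."""
    rng = random.Random(seed)
    q = defaultdict(float)
    state = (0, 0)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        next_state, reward = step(state, action, rng)
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q
```

Calling `train()` returns the learned table, and `max(ACTIONS, key=lambda a: q[(state, a)])` reads off the greedy phase choice for a given `state`. The update needs no transition probabilities, which is the property the abstract highlights; a post-decision-state ADP variant or a function approximator would replace the table in `q`.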

Bibliographic record

Author: Medina, Juan C.
Author affiliation: University of Illinois at Urbana-Champaign
Degree-granting institution: University of Illinois at Urbana-Champaign
Discipline: Engineering, Civil
Degree: Ph.D.
Year: 2013
Pages: 159 p.
Total pages: 159
Original format: PDF
Language: English
Date added to database: 2022-08-17 11:41:19
