IEEE Conference on Computer Communications Workshops

Engineering A Large-Scale Traffic Signal Control: A Multi-Agent Reinforcement Learning Approach



Abstract

Reinforcement learning (RL) is a central topic in machine learning and, combined with deep neural networks, a promising approach to traffic signal control in urban road networks. In a large-scale urban network, however, centralized RL is beset with difficulties due to the extremely high dimensionality of the joint action space. Multi-agent reinforcement learning (MARL) overcomes this problem by employing distributed local agents, each with a much smaller action space. MARL nevertheless introduces another issue: multiple agents interact with the environment simultaneously, making it non-stationary, so training each agent independently may not converge. This paper presents an actor-critic based decentralized MARL approach to traffic signal control that overcomes the shortcomings of both the centralized RL approach and the independent MARL approach. In particular, a distributed critic network is designed that avoids the difficulty of training a single large-scale neural network, as centralized RL requires. Moreover, a difference reward method is proposed to evaluate the contribution of each agent, which accelerates convergence and steers each agent's policy optimization in a more accurate direction. The proposed MARL approach is compared against a fully independent approach and a centralized learning approach on a grid network. Simulation results demonstrate its effectiveness over the other MARL algorithms in terms of average travel speed, travel delay, and queue length.
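The difference reward mentioned in the abstract is a standard credit-assignment device: agent i is credited with the change in the global reward when its action is replaced by a fixed default, D_i = G(a) - G(a_{-i}, default). The sketch below is a minimal, generic illustration of that idea, not the paper's implementation; the toy queue-clearing reward and all names (`difference_rewards`, `default_action`, `toy_reward`) are illustrative assumptions.

```python
# Hedged sketch of difference rewards for per-agent credit assignment:
# D_i = G(joint action) - G(joint action with agent i's action replaced
# by a default). The toy reward model below is illustrative only.
from typing import Callable, List, Sequence


def difference_rewards(
    joint_action: Sequence[int],
    global_reward: Callable[[Sequence[int]], float],
    default_action: int = 0,
) -> List[float]:
    """Return the difference reward D_i for each agent in the joint action."""
    g = global_reward(joint_action)
    rewards = []
    for i in range(len(joint_action)):
        # Counterfactual joint action: agent i takes the default action.
        counterfactual = list(joint_action)
        counterfactual[i] = default_action
        rewards.append(g - global_reward(counterfactual))
    return rewards


# Toy global reward: negative total remaining queue length, where each
# intersection agent's action 1 ("serve this approach") clears up to
# 3 vehicles from its local queue.
queues = [5, 2, 7]


def toy_reward(actions: Sequence[int]) -> float:
    return -sum(max(q - 3 * a, 0) for q, a in zip(queues, actions))


print(difference_rewards([1, 0, 1], toy_reward))  # per-agent contributions
```

Because agent 1's queue is already short, its counterfactual changes nothing and it receives zero credit, while agents 0 and 2 are each credited with the 3 vehicles they cleared; this per-agent signal is what lets each local critic learn from a shared global objective.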
