

Effects analysis of reward functions on reinforcement learning for traffic signal control.

Affiliations

Department of Transportation Engineering, University of Seoul, Seoul, Korea.

Civil and Environmental Engineering, University of Windsor, Windsor, Canada.

Publication Information

PLoS One. 2022 Nov 21;17(11):e0277813. doi: 10.1371/journal.pone.0277813. eCollection 2022.

DOI: 10.1371/journal.pone.0277813
PMID: 36409713
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9678263/
Abstract

The increasing traffic demand in urban areas frequently causes traffic congestion, which can be managed only through intelligent traffic signal controls. Although many recent studies have focused on reinforcement learning for traffic signal control (RL-TSC), most have focused on improving performance from an intersection perspective, targeting virtual simulation. The performance indexes from intersection perspectives are averaged by the weighted traffic flow; therefore, if the balance of each movement is not considered, the green time may be overly concentrated on the movements of heavy flow rates. Furthermore, as the ultimate purpose of traffic signal control research is to apply these controls to the real-world intersections, it is necessary to consider the real-world constraints. Hence, this study aims to design RL-TSC considering real-world applicability and confirm the appropriate design of the reward function. The limitations of the detector in the real world and the dual-ring traffic signal system are taken into account in the model design to facilitate real-world application. To design the reward for balancing traffic movements, we define the average delay weighted by traffic volume per lane and entropy of delay in the reward function. Model training is performed at the prototype intersection for ensuring scalability to multiple intersections. The model after prototype pre-training is evaluated by applying it to a network with two intersections without additional training. As a result, the reward function considering the equality of traffic movements shows the best performance. The proposed model reduces the average delay by more than 7.4% and 15.0% compared to the existing real-time adaptive signal control at two intersections, respectively.
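To make the reward design concrete, the sketch below illustrates how the two terms named in the abstract — average delay weighted by traffic volume per lane, and entropy of delay across movements — could be combined into a scalar reward for an RL-TSC agent. This is a minimal illustration under assumptions, not the paper's implementation: the coefficients alpha and beta, the sign convention, and the additive combination are placeholders chosen for the example.

```python
import numpy as np

def delay_entropy(delays):
    """Shannon entropy of the normalized delay distribution across movements.

    Higher entropy means delay is spread more evenly over the movements,
    i.e. green time is not overly concentrated on the heaviest flows.
    """
    delays = np.asarray(delays, dtype=float)
    total = delays.sum()
    if total <= 0:
        return 0.0
    p = delays / total
    p = p[p > 0]  # drop zero-probability terms to avoid log(0)
    return float(-(p * np.log(p)).sum())

def reward(delays, volumes, alpha=1.0, beta=1.0):
    """Hypothetical per-step reward combining the two terms named in the abstract:

    1. average delay weighted by traffic volume per lane (to be minimized), and
    2. entropy of delay across movements (to be maximized, for balance).

    alpha, beta and the additive combination are illustrative placeholders,
    not the weighting actually used in the paper.
    """
    delays = np.asarray(delays, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    weighted_avg_delay = (volumes * delays).sum() / max(volumes.sum(), 1e-9)
    return -alpha * weighted_avg_delay + beta * delay_entropy(delays)

# Two scenarios with similar volume-weighted delay but different balance:
balanced   = reward(delays=[30, 32, 29, 31], volumes=[200, 180, 210, 190])
unbalanced = reward(delays=[5, 5, 5, 107],   volumes=[200, 180, 210, 190])
print(balanced, unbalanced)  # the balanced scenario receives the higher reward
```

In this formulation the weighted-delay term penalizes overall congestion, while the entropy term rewards spreading delay evenly across movements, which is the balancing effect the abstract attributes to the equality-aware reward.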


[Figures 1–11 (pone.0277813.g001–g011) are available in the PMC full-text record linked above.]

Similar Articles

1. Effects analysis of reward functions on reinforcement learning for traffic signal control. PLoS One. 2022 Nov 21;17(11):e0277813. doi: 10.1371/journal.pone.0277813. eCollection 2022.
2. Biased Pressure: Cyclic Reinforcement Learning Model for Intelligent Traffic Signal Control. Sensors (Basel). 2022 Apr 6;22(7):2818. doi: 10.3390/s22072818.
3. A scalable approach to optimize traffic signal control with federated reinforcement learning. Sci Rep. 2023 Nov 6;13(1):19184. doi: 10.1038/s41598-023-46074-3.
4. Self-learning adaptive traffic signal control for real-time safety optimization. Accid Anal Prev. 2020 Oct;146:105713. doi: 10.1016/j.aap.2020.105713. Epub 2020 Aug 18.
5. How to Improve Urban Intelligent Traffic? A Case Study Using Traffic Signal Timing Optimization Model Based on Swarm Intelligence Algorithm. Sensors (Basel). 2021 Apr 8;21(8):2631. doi: 10.3390/s21082631.
6. Real-time signal-vehicle coupled control: An application of connected vehicle data to improve intersection safety. Accid Anal Prev. 2021 Nov;162:106389. doi: 10.1016/j.aap.2021.106389. Epub 2021 Sep 21.
7. Cooperative Traffic Signal Control with Traffic Flow Prediction in Multi-Intersection. Sensors (Basel). 2019 Dec 24;20(1):137. doi: 10.3390/s20010137.
8. Multi-Objective Optimization Method for Signalized Intersections in Intelligent Traffic Network. Sensors (Basel). 2023 Jul 11;23(14):6303. doi: 10.3390/s23146303.
9. Deep Reinforcement Learning for Traffic Signal Control Model and Adaptation Study. Sensors (Basel). 2022 Nov 11;22(22):8732. doi: 10.3390/s22228732.
10. MARLens: Understanding Multi-Agent Reinforcement Learning for Traffic Signal Control via Visual Analytics. IEEE Trans Vis Comput Graph. 2025 Jul;31(7):4018-4033. doi: 10.1109/TVCG.2024.3392587.

Cited By

1. MAGT-toll: A multi-agent reinforcement learning approach to dynamic traffic congestion pricing. PLoS One. 2024 Nov 18;19(11):e0313828. doi: 10.1371/journal.pone.0313828. eCollection 2024.

References

1. Cooperative Deep Reinforcement Learning for Large-Scale Traffic Grid Signal Control. IEEE Trans Cybern. 2020 Jun;50(6):2687-2700. doi: 10.1109/TCYB.2019.2904742. Epub 2019 Mar 29.