Computer Science Department, Faculty of Computers and Information Sciences, Mansoura University, Mansoura, Egypt.
Information Systems Department, Faculty of Computers and Information Sciences, Mansoura University, Mansoura, Egypt.
PLoS One. 2021 Jun 10;16(6):e0252754. doi: 10.1371/journal.pone.0252754. eCollection 2021.
Deep Reinforcement Learning (DRL) enables agents to make decisions based on a well-designed reward function that suits a particular environment without any prior knowledge of that environment. The tuning of hyperparameters has a great impact on the overall learning process and the training time. Hyperparameters should be accurately estimated while training DRL algorithms, which is one of the key challenges that we attempt to address. This paper employs a swarm-based optimization algorithm, namely the Whale Optimization Algorithm (WOA), to optimize the hyperparameters of the Deep Deterministic Policy Gradient (DDPG) algorithm and achieve the optimal control strategy in an autonomous driving control problem. DDPG is capable of handling complex environments with continuous action spaces. To evaluate the proposed algorithm, The Open Racing Car Simulator (TORCS), a realistic autonomous driving simulation environment, was chosen due to its ease of design and implementation. Using TORCS, the DDPG agent with optimized hyperparameters was compared with a DDPG agent using reference hyperparameters. The experimental results showed that optimizing the DDPG's hyperparameters maximizes the total reward across testing episodes while maintaining a stable driving policy.
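The core idea of the approach can be sketched as follows: WOA maintains a population of candidate hyperparameter vectors and updates them with its standard encircling, bubble-net spiral, and random-search moves, scoring each candidate by the total reward a DDPG agent earns when trained with it. The Python sketch below is a minimal illustration of that loop, not the paper's implementation: the hyperparameter set (actor/critic learning rates, discount factor gamma, soft-update rate tau), their bounds, and the evaluate_ddpg stub are all assumptions for illustration; in the paper the evaluation would correspond to training and running DDPG in TORCS.

import numpy as np

# Hypothetical fitness function (assumption, not from the paper): train a
# DDPG agent with the given hyperparameters and return its total reward.
def evaluate_ddpg(hparams):
    actor_lr, critic_lr, gamma, tau = hparams
    raise NotImplementedError("train DDPG in TORCS and return total reward")

def woa_optimize(fitness, bounds, n_whales=10, n_iters=30, seed=0):
    """Minimal Whale Optimization Algorithm (maximizing fitness).

    fitness: maps a hyperparameter vector to a score (higher is better).
    bounds:  sequence of (low, high) pairs, one per hyperparameter.
    """
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    dim = len(bounds)
    # Initialize whale positions uniformly within the search bounds.
    X = rng.uniform(lo, hi, size=(n_whales, dim))
    scores = np.array([fitness(x) for x in X])
    best = X[scores.argmax()].copy()
    best_score = scores.max()

    for t in range(n_iters):
        a = 2 - 2 * t / n_iters              # decreases linearly from 2 to 0
        for i in range(n_whales):
            A = 2 * a * rng.random() - a     # scalar coefficient in [-a, a]
            C = 2 * rng.random(dim)          # per-dimension coefficient in [0, 2]
            if rng.random() < 0.5:
                if abs(A) < 1:
                    # Encircling prey: move toward the best whale found so far.
                    D = np.abs(C * best - X[i])
                    X[i] = best - A * D
                else:
                    # Exploration: move relative to a randomly chosen whale.
                    rand = X[rng.integers(n_whales)]
                    D = np.abs(C * rand - X[i])
                    X[i] = rand - A * D
            else:
                # Bubble-net attack: logarithmic spiral toward the best whale.
                D = np.abs(best - X[i])
                l = rng.uniform(-1, 1, dim)
                X[i] = D * np.exp(l) * np.cos(2 * np.pi * l) + best
            X[i] = np.clip(X[i], lo, hi)     # keep candidates inside bounds
            s = fitness(X[i])
            if s > best_score:
                best, best_score = X[i].copy(), s
    return best, best_score

Under these assumptions, a call such as woa_optimize(evaluate_ddpg, bounds=[(1e-5, 1e-3), (1e-4, 1e-2), (0.9, 0.999), (1e-3, 1e-1)]) would return the best-scoring hyperparameter vector found; the bounds shown are illustrative, not the ranges used in the paper.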