Alohali Manal Abdullah, Alqahtani Hamed, Darem Abdulbasit, Abdullah Monir, Nam Yunyoung, Abouhawwash Mohamed
Department of Information Systems, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Department of Information Systems, Abha, Saudi Arabia.
PeerJ Comput Sci. 2025 Jun 10;11:e2823. doi: 10.7717/peerj-cs.2823. eCollection 2025.
Cyber-physical systems (CPSs) in autonomous vehicles must operate in highly dynamic and uncertain settings, where unanticipated obstacles, shifting traffic conditions, and environmental changes all pose substantial decision-making challenges. Deep reinforcement learning (DRL) has emerged as a powerful tool for handling such uncertainty, yet current DRL models struggle to ensure safe and optimal behaviour in uncertain settings because dynamic reward structures are difficult to capture. To address these constraints, this study employs double deep Q-networks (DDQN) to improve the agent's adaptability under uncertain driving conditions. A structured reward system is established to accommodate real-time fluctuations, resulting in safer and more efficient decision-making. Recognising the computational limitations of automotive CPSs, the study also investigates hardware acceleration as a complement to the algorithmic enhancements. Because of their post-manufacturing adaptability, parallel processing capabilities, and reconfigurability, field-programmable gate arrays (FPGAs) are used to execute reinforcement learning in real time. The proposed method is thoroughly evaluated in the TORCS Racing Simulator using key metrics including collision rate, behaviour similarity, travel distance, speed control, total reward, and timesteps. The findings show that combining FPGA-based hardware acceleration with DDQN improves computational efficiency and decision-making reliability, addressing significant issues caused by uncertainty in autonomous-driving CPSs. Beyond advancing reinforcement learning applications in CPSs, this work opens avenues for future investigations into real-world generalisation, adaptive reward mechanisms, and scalable hardware implementations to further reduce uncertainty in autonomous systems.
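The abstract centres on double deep Q-networks (DDQN), whose key idea is decoupling action *selection* (online network) from action *evaluation* (target network) when forming the temporal-difference target. The sketch below illustrates only that target computation in NumPy; the batch shapes, discount factor, and toy Q-values are illustrative assumptions, not the paper's actual configuration or network architecture.

```python
import numpy as np

def ddqn_targets(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    """Compute DDQN temporal-difference targets for a batch of transitions.

    q_online_next: (batch, n_actions) Q-values of next states from the online net
    q_target_next: (batch, n_actions) Q-values of next states from the target net
    rewards:       (batch,) immediate rewards
    dones:         (batch,) 1.0 if the episode terminated, else 0.0
    """
    # Select the greedy next action with the ONLINE network...
    best_actions = np.argmax(q_online_next, axis=1)
    # ...but evaluate it with the TARGET network, which is what reduces
    # the overestimation bias of vanilla DQN.
    next_q = q_target_next[np.arange(len(best_actions)), best_actions]
    # Terminal transitions bootstrap no future value.
    return rewards + gamma * (1.0 - dones) * next_q
```

In a full agent this target would feed a regression loss against the online network's Q-value for the taken action, with the target network's weights periodically synchronised from the online network.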