Hou Xiaohui, Gan Minggang, Wu Wei, Zhao Shiyue, Ji Yuan, Chen Jie
IEEE Trans Cybern. 2024 Dec 30;PP. doi: 10.1109/TCYB.2024.3518697.
This study addresses trajectory planning and motion control for autonomous racing, where a vehicle must be driven at its performance limits to achieve maximum speed and minimum lap time. We propose a planning and control framework that integrates risk-conscious mutations in jump-start reinforcement learning (RCM-JSRL) with nonlinear model predictive control (NMPC). The RCM-JSRL algorithm incorporates jump-start curriculum learning and a risk-conscious genetic algorithm into reinforcement learning, leveraging prior expert knowledge and a curiosity-driven exploration mechanism to improve training efficiency while avoiding overly conservative policies in highly complex, high-risk scenarios. NMPC generates locally optimal control commands that satisfy the vehicle dynamics constraints while following the designated trajectory. After training on track maps of varying difficulty, the proposed controller executes a policy that outperforms the guide policy, which supports its effectiveness and scalability. We believe this technology can also be applied to everyday driving, improving efficiency under special conditions, ensuring stability in critical situations, and broadening the scope of autonomous driving applications.
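The jump-start curriculum mentioned in the abstract can be illustrated with a minimal Python sketch. It shows only the generic jump-start idea as it is commonly described: a guide policy controls the first steps of an episode, an exploration policy takes over afterwards, and the handover point moves earlier once the combined policy performs well enough. It is not the authors' RCM-JSRL implementation and omits the risk-conscious genetic mutations, the curiosity mechanism, and NMPC entirely. All names here (`jump_start_rollout`, `next_handover`, `toy_step`, the two toy policies, and the return threshold) are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical types for this sketch: a state is an int position,
# an action is an int speed, and a step returns (state, reward, done).
Policy = Callable[[int], int]
StepFn = Callable[[int, int], Tuple[int, float, bool]]


def jump_start_rollout(
    step_fn: StepFn,
    guide: Policy,
    explorer: Policy,
    handover_step: int,
    horizon: int,
    start_state: int = 0,
) -> Tuple[float, List[str]]:
    """Run one episode in which the guide policy acts for the first
    `handover_step` steps and the exploration policy acts afterwards."""
    state = start_state
    total_return = 0.0
    actors: List[str] = []
    for t in range(horizon):
        use_guide = t < handover_step
        action = guide(state) if use_guide else explorer(state)
        state, reward, done = step_fn(state, action)
        total_return += reward
        actors.append("guide" if use_guide else "explorer")
        if done:
            break
    return total_return, actors


def next_handover(
    handover_step: int,
    mean_return: float,
    threshold: float,
    decrement: int = 1,
) -> int:
    """Curriculum rule (an assumption of this sketch): shorten the guide's
    share of the episode once the mixed policy reaches the threshold."""
    if mean_return >= threshold:
        return max(0, handover_step - decrement)
    return handover_step


def toy_step(state: int, action: int) -> Tuple[int, float, bool]:
    """Toy 1-D 'track': reward equals progress, episode ends at position 10."""
    new_state = state + action
    return new_state, float(action), new_state >= 10


def cautious_guide(state: int) -> int:
    """Stand-in for an expert guide policy that always drives slowly."""
    return 1


def fast_explorer(state: int) -> int:
    """Stand-in for the learned exploration policy that drives faster."""
    return 2


if __name__ == "__main__":
    handover = 3
    episode_return, actors = jump_start_rollout(
        toy_step, cautious_guide, fast_explorer, handover, horizon=20
    )
    handover = next_handover(handover, episode_return, threshold=10.0)
    print(episode_return, actors, handover)
```

In a real training loop the exploration policy would be updated from the collected transitions between rollouts, and the threshold would typically be tied to the guide policy's own performance; both choices are left open here.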