Qiu Jiandong, Xu Shusheng, Tang Minan, Liu Jiaxuan, Song Hailong
School of Mechanical Engineering, Lanzhou Jiaotong University, Lanzhou, China.
School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou, China.
PLoS One. 2025 Apr 8;20(4):e0320762. doi: 10.1371/journal.pone.0320762. eCollection 2025.
Drawing up the shunting operation plan is the main daily work of a freight train depot, and optimizing this plan is of great significance for improving the efficiency of railway operation, production, and transportation. In this paper, a deep reinforcement learning (DRL) environment and model for the shunting operation problem are constructed from three elements: action, state, and reward. The shunting locomotive is taken as the agent, the lane (track) number onto which a train group is dropped as the action, and the drop conditions of the train groups as the state; the reward function is designed based on the total number of shunting hooks generated after the groups are broken up and reassembled. The model is solved with the deep Q-network (DQN) algorithm, with the objective of minimizing the number of shunting hooks; after sufficient training, the optimal shunting operation plan can be obtained. The effectiveness of DQN is verified through example simulations. Compared with the overall planning and coordinating (OPC) method, DQN produces a shunting operation plan that occupies fewer lanes and generates 10% fewer total shunting hooks. Compared with the binary search tree (BST) algorithm, DQN generates 5% fewer total shunting hooks. Compared with the branch and bound (B&B) algorithm, DQN takes less time to solve, the numbers of freight trains handled by the coupling and slipping operations are reduced by 5.3% and 2.9%, respectively, and the quality of the resulting shunting operation plan is better. This paper therefore provides a new solution for the intelligentization of shunting operations in large freight train depots.
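The agent/action/state/reward formulation described above can be sketched as a minimal episodic environment. This is an illustrative assumption, not the paper's implementation: the class name `ShuntingEnv`, the state encoding (destination of the last group on each lane), and the reward proxy (-1 whenever a group lands on a lane already holding a group bound for a different destination, standing in for the paper's shunting-hook count) are all hypothetical simplifications.

```python
class ShuntingEnv:
    """Hypothetical sketch of the DRL formulation in the abstract.

    Agent  : the shunting locomotive (implicit in the step loop)
    Action : index of the lane (track) the next train group is dropped onto
    State  : destination id of the last group on each lane (0 = empty lane)
    Reward : negative hook-count proxy; -1 when a group is dropped behind a
             group bound for a different destination (assumed simplification
             of the paper's reward based on total shunting hooks).
    """

    def __init__(self, groups, n_lanes):
        self.groups = list(groups)      # destination id of each group, in arrival order
        self.n_lanes = n_lanes
        self.reset()

    def reset(self):
        self.pos = 0                    # index of the next group to drop
        self.lanes = [0] * self.n_lanes # destination of last group on each lane
        return tuple(self.lanes)

    def step(self, action):
        dest = self.groups[self.pos]
        # an extra shunting hook is assumed whenever the chosen lane already
        # holds a group headed to a different destination
        reward = -1 if self.lanes[action] not in (0, dest) else 0
        self.lanes[action] = dest
        self.pos += 1
        done = self.pos == len(self.groups)
        return tuple(self.lanes), reward, done


# Usage: four groups for two destinations, dropped onto two lanes.
env = ShuntingEnv(groups=[1, 2, 1, 2], n_lanes=2)
state = env.reset()
total = 0
for action in (0, 1, 0, 0):            # last drop mixes destinations on lane 0
    state, reward, done = env.step(action)
    total += reward
```

A DQN agent would then learn a policy mapping these states to lane choices so as to maximize the cumulative reward, i.e. minimize the total number of shunting hooks.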