Hou Jing, Chen Guang, Li Zhijun, He Wei, Gu Shangding, Knoll Alois, Jiang Changjun
IEEE Trans Cybern. 2024 May;54(5):2771-2783. doi: 10.1109/TCYB.2023.3312647. Epub 2024 Apr 16.
Industries such as manufacturing are increasingly embracing the metaverse to raise productivity, especially in complex industrial scheduling. Given the growing parking challenges in large cities, high-density vehicle spatial scheduling is one potential solution. Stack-based parking lots use parking robots to park vehicles densely in vertical stacks, much like container stacking, which greatly reduces the aisle area in the parking lot but requires complex scheduling algorithms to park and retrieve vehicles. Existing high-density parking (HDP) scheduling algorithms are mainly heuristic methods, which contain only simple logic and struggle to use the available information effectively. We propose a hybrid residual multiexpert (HIRE) reinforcement learning (RL) approach, a method for interactive learning in the digital industrial metaverse, which efficiently solves the HDP batch space scheduling problem. In the proposed framework, each heuristic scheduling method is treated as an expert, and a neural network trained by RL assigns the expert strategy according to the current parking lot state. Furthermore, to avoid being limited by the performance of the heuristic experts, the proposed hierarchical network framework also includes a residual output channel. Experiments show that the proposed algorithm outperforms various advanced heuristic methods and the end-to-end RL method in the number of vehicle maneuvers, and is robust to the parking lot size and to the estimation accuracy of vehicle exit time. We believe that the proposed HIRE RL method can be applied effectively and conveniently to practical scenarios, which can be regarded as a key step for RL to enter the practical application stage of the industrial metaverse.
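The expert-plus-residual idea in the abstract can be illustrated with a small sketch. The abstract does not give the network architecture, the actual heuristics, or how the residual channel is combined with the expert output, so everything below is an assumption made for illustration: the two heuristics (`expert_emptiest`, `expert_no_blocking`), the function `hire_select`, the additive combination rule, and the use of plain numbers in place of the outputs of an RL-trained network are all hypothetical.

```python
from typing import Callable, List, Sequence

# A parking lot is modelled as a list of stacks; each stack holds the
# estimated exit times of the vehicles already parked in it.
Stacks = List[List[float]]
Expert = Callable[[Stacks, float, int], List[float]]


def expert_emptiest(stacks: Stacks, exit_time: float, capacity: int) -> List[float]:
    """Hypothetical heuristic: prefer stacks with the most free slots."""
    return [float(capacity - len(s)) for s in stacks]


def expert_no_blocking(stacks: Stacks, exit_time: float, capacity: int) -> List[float]:
    """Hypothetical heuristic: prefer stacks where the new vehicle is
    expected to leave no later than every vehicle already in the stack."""
    return [1.0 if all(exit_time <= t for t in s) else 0.0 for s in stacks]


def hire_select(
    stacks: Stacks,
    exit_time: float,
    capacity: int,
    experts: Sequence[Expert],
    expert_weights: Sequence[float],
    residual: Sequence[float],
) -> int:
    """Pick a stack index for an incoming vehicle.

    expert_weights and residual stand in for the two outputs of a trained
    policy network (assumed here, not taken from the paper): the first
    weights the experts, the second adds a per-stack correction.
    """
    per_expert = [e(stacks, exit_time, capacity) for e in experts]
    combined = []
    for i, stack in enumerate(stacks):
        if len(stack) >= capacity:
            # Full stacks are never eligible.
            combined.append(float("-inf"))
            continue
        mixed = sum(w * scores[i] for w, scores in zip(expert_weights, per_expert))
        combined.append(mixed + residual[i])
    if all(c == float("-inf") for c in combined):
        raise ValueError("no stack has free capacity")
    return max(range(len(stacks)), key=lambda i: combined[i])


stacks = [[5.0, 3.0], [9.0], []]
experts = [expert_emptiest, expert_no_blocking]

# Residual is zero: the choice follows the selected expert (emptiest stack).
print(hire_select(stacks, 4.0, 2, experts, [1.0, 0.0], [0.0, 0.0, 0.0]))  # 2

# A non-zero residual lets the policy deviate from the expert's choice.
print(hire_select(stacks, 4.0, 2, experts, [1.0, 0.0], [0.0, 1.5, 0.0]))  # 1
```

With a one-hot `expert_weights` vector and a zero residual, the sketch reduces to running a single heuristic. The residual term is what allows the learned policy to depart from every expert, which is the motivation the abstract gives for the residual output channel.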