Wang Runqing, Wang Gang, Sun Jian, Deng Fang, Chen Jie
IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):3091-3102. doi: 10.1109/TNNLS.2023.3306421. Epub 2024 Feb 29.
Flexible manufacturing has given rise to complex scheduling problems such as the flexible job shop scheduling problem (FJSP). In FJSP, each operation can be processed on any one of several eligible machines, leading to intricate relationships between operations and machines. Recent works have employed deep reinforcement learning (DRL) to learn priority dispatching rules (PDRs) for solving FJSP. However, the quality of the resulting solutions still has room for improvement relative to that of exact methods such as OR-Tools. To address this issue, this article presents a novel end-to-end learning framework that combines the merits of self-attention models for deep feature extraction and DRL for scalable decision-making. The complex relationships between operations and machines are represented precisely and concisely, and a dual-attention network (DAN) comprising several interconnected operation message attention blocks and machine message attention blocks is proposed to work on this representation. The DAN exploits these relationships to construct production-adaptive operation and machine features that support high-quality decision-making. Experimental results on synthetic data as well as public benchmarks corroborate that the proposed approach outperforms both traditional PDRs and the state-of-the-art DRL method. Moreover, it achieves results comparable to exact methods in certain cases and demonstrates favorable generalization ability to large-scale and real-world unseen FJSP tasks.
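To make the problem setting concrete, the following is a minimal Python sketch of a toy FJSP instance scheduled with a hand-crafted dispatching rule, the kind of traditional PDR baseline the abstract refers to. It is not the proposed DAN or DRL method. The instance data, the function name `dispatch_greedy`, and the earliest-completion-time rule are all illustrative assumptions and do not come from the article.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: jobs[j][k] maps each eligible machine id to the
# processing time of operation k of job j. Operations of a job run in order.
Job = List[Dict[int, int]]
# One scheduled operation: (job, op_index, machine, start, end).
Step = Tuple[int, int, int, int, int]


def dispatch_greedy(jobs: List[Job], num_machines: int) -> Tuple[int, List[Step]]:
    """Greedy rule (an assumption, not the article's policy): at every step,
    among all ready operations and their eligible machines, choose the
    (operation, machine) pair that would finish earliest."""
    next_op = [0] * len(jobs)          # next unscheduled operation per job
    job_ready = [0] * len(jobs)        # end time of each job's last scheduled op
    machine_free = [0] * num_machines  # time at which each machine becomes idle
    schedule: List[Step] = []
    remaining = sum(len(job) for job in jobs)

    while remaining:
        best = None  # (end, job, machine, start); ties broken by tuple order
        for j, job in enumerate(jobs):
            if next_op[j] == len(job):
                continue  # this job is already complete
            for m, duration in job[next_op[j]].items():
                start = max(job_ready[j], machine_free[m])
                candidate = (start + duration, j, m, start)
                if best is None or candidate < best:
                    best = candidate
        end, j, m, start = best
        schedule.append((j, next_op[j], m, start, end))
        job_ready[j] = end
        machine_free[m] = end
        next_op[j] += 1
        remaining -= 1

    return max(machine_free), schedule


# Toy instance with 2 jobs and 2 machines (made-up numbers).
jobs: List[Job] = [
    [{0: 3, 1: 5}, {1: 2}],  # job 0: op 0 on machine 0 or 1, op 1 on machine 1
    [{0: 2}, {0: 4, 1: 3}],  # job 1: op 0 on machine 0, op 1 on machine 0 or 1
]
makespan, schedule = dispatch_greedy(jobs, num_machines=2)
print("makespan:", makespan)
for step in schedule:
    print(step)
```

A fixed rule like this looks only at local completion times. A learned policy of the kind the abstract describes would instead score the same (operation, machine) candidates from learned features, so the selection step inside the loop is where the two approaches would differ.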