
A deep reinforcement learning algorithm for the rectangular strip packing problem.

Affiliations

State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China.

Laboratory of Multimedia Stream Computing and Storage, Huazhong University of Science and Technology, Wuhan, China.

Publication information

PLoS One. 2023 Mar 16;18(3):e0282598. doi: 10.1371/journal.pone.0282598. eCollection 2023.

Abstract

As a branch of the two-dimensional (2D) optimal blanking problem, rectangular strip packing is a typical NP-hard (non-deterministic polynomial-time hard) problem. Classical packing methods rely on heuristic and metaheuristic algorithms. These usually require manually designed decision rules to guide the search, which limits the problem scale they can handle and results in weak generalization and low solution efficiency. Inspired by deep learning and reinforcement learning, and taking into account the characteristics of rectangular piece packing, this work proposes a novel algorithm based on deep reinforcement learning to solve the rectangular strip packing problem. A pointer network with an encoder-decoder structure is taken as the base network for the deep reinforcement learning algorithm. A model-free reinforcement learning algorithm is designed to train the network parameters so as to optimize the packing sequence. This design avoids designing separate heuristic rules for different problems and uses the self-learning capability of deep networks to solve a wider range of instances. In addition, a piece positioning algorithm based on the maximum rectangles bottom-left (Maxrects-BL) method is designed to determine where each piece is placed on the plate and to calculate the model rewards and packing parameters. Finally, test instances are used to analyze the optimization effect of the algorithm. The experimental results show that, compared with some classical heuristic algorithms, the proposed algorithm produces three better and five comparable results. In addition, its calculation time is less than 1 second in all test instances, which indicates good generalization, solution efficiency, and practical application potential.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/345a/10019708/f00255100761/pone.0282598.g001.jpg
