A deep reinforcement learning algorithm for the rectangular strip packing problem.

Affiliations

State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China.

Laboratory of Multimedia Stream Computing and Storage, Huazhong University of Science and Technology, Wuhan, China.

Publication information

PLoS One. 2023 Mar 16;18(3):e0282598. doi: 10.1371/journal.pone.0282598. eCollection 2023.

DOI:10.1371/journal.pone.0282598
PMID:36928505
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10019708/
Abstract

As a branch of the two-dimensional (2D) optimal blanking problem, rectangular strip packing is a typical NP-hard (non-deterministic polynomial-time hard) problem. Classical packing methods rely on heuristic and metaheuristic algorithms, which usually require manually designed decision rules to guide the search. This results in small solvable problem sizes, weak generalization, and low solution efficiency. Inspired by deep learning and reinforcement learning, and taking into account the characteristics of rectangular piece packing, this work proposes a novel algorithm based on deep reinforcement learning to solve the rectangular strip packing problem. A pointer network with an encoder-decoder structure serves as the base network of the deep reinforcement learning algorithm. A model-free reinforcement learning algorithm is designed to train the network parameters so as to optimize the packing sequence. This design avoids designing heuristic rules separately for different problems and lets a deep network with self-learning characteristics solve a wider range of instances. In addition, a piece positioning algorithm based on maximum rectangles bottom-left (Maxrects-BL) is designed to determine the placement position of each piece on the plate and to calculate the model rewards and packing parameters. Finally, test instances are used to analyze the optimization effect of the algorithm. The experimental results show that the proposed algorithm produces three better and five comparable results compared with some classical heuristic algorithms. Moreover, its calculation time is less than 1 second in all test instances, indicating good generalization, solution efficiency, and practical application potential.
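
The abstract separates the method into two parts: a learned policy that chooses the packing sequence, and a deterministic positioning step that turns a sequence into a layout and a reward. The following is a minimal sketch of the second part only, not the authors' implementation. It swaps the paper's Maxrects-BL positioning for a simpler skyline-style bottom-left drop (each piece goes to the lowest, then leftmost, resting position), and it assumes the reward is the negative strip height. The names `pack_in_order`, `drop_height`, and `reward` are hypothetical.

```python
from typing import List, Tuple

Piece = Tuple[int, int]                 # (width, height)
Placement = Tuple[int, int, int, int]   # (x, y, width, height)


def drop_height(placed: List[Placement], x: int, w: int) -> int:
    """Lowest y at which a piece spanning [x, x + w) rests when dropped from above."""
    return max(
        (py + ph for px, py, pw, ph in placed if px < x + w and x < px + pw),
        default=0,
    )


def pack_in_order(pieces: List[Piece], strip_width: int) -> Tuple[List[Placement], int]:
    """Place pieces in the given order; return the layout and the used strip height.

    Simplified stand-in for Maxrects-BL: it does not fill holes under overhangs.
    """
    placed: List[Placement] = []
    for w, h in pieces:
        if w > strip_width:
            raise ValueError(f"piece of width {w} does not fit strip of width {strip_width}")
        # Candidate x positions: the left wall and the right edge of every placed piece.
        candidates = sorted({0} | {px + pw for px, _, pw, _ in placed})
        best = None  # (y, x), compared lexicographically: lowest first, then leftmost
        for x in candidates:
            if x + w > strip_width:
                continue
            y = drop_height(placed, x, w)
            if best is None or (y, x) < best:
                best = (y, x)
        y, x = best  # x = 0 is always feasible, so best is never None here
        placed.append((x, y, w, h))
    height = max((y + h for _, y, _, h in placed), default=0)
    return placed, height


def reward(pieces: List[Piece], order: List[int], strip_width: int) -> float:
    """Assumed reward signal: negative strip height of the layout for this sequence."""
    _, height = pack_in_order([pieces[i] for i in order], strip_width)
    return -float(height)


pieces = [(4, 2), (2, 2), (2, 2), (4, 2)]
print(reward(pieces, [0, 1, 2, 3], strip_width=6))  # -4.0 (a perfect 6 x 4 packing)
```

In a training loop, a sequence model such as a pointer network would propose `order`, and a model-free policy-gradient update would use this scalar to make shorter layouts more likely. That part is omitted here because the abstract does not specify it.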

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/345a/10019708/f00255100761/pone.0282598.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/345a/10019708/c98dd2a0dcef/pone.0282598.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/345a/10019708/730e23eb52eb/pone.0282598.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/345a/10019708/5af2bec10e42/pone.0282598.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/345a/10019708/a4bb6e60df82/pone.0282598.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/345a/10019708/02a1a910f6ed/pone.0282598.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/345a/10019708/9a1a667ed086/pone.0282598.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/345a/10019708/a5af70ef5491/pone.0282598.g008.jpg

Similar articles

1
A deep reinforcement learning algorithm for the rectangular strip packing problem.
PLoS One. 2023 Mar 16;18(3):e0282598. doi: 10.1371/journal.pone.0282598. eCollection 2023.
2
An efficient constructive heuristic for the rectangular packing problem with rotations.
PLoS One. 2023 Dec 28;18(12):e0295206. doi: 10.1371/journal.pone.0295206. eCollection 2023.
3
A deep reinforcement learning algorithm framework for solving multi-objective traveling salesman problem based on feature transformation.
Neural Netw. 2024 Aug;176:106359. doi: 10.1016/j.neunet.2024.106359. Epub 2024 May 3.
4
Distributed deep reinforcement learning based on bi-objective framework for multi-robot formation.
Neural Netw. 2024 Mar;171:61-72. doi: 10.1016/j.neunet.2023.11.063. Epub 2023 Dec 1.
5
Integrating Heuristic Methods with Deep Reinforcement Learning for Online 3D Bin-Packing Optimization.
Sensors (Basel). 2024 Aug 20;24(16):5370. doi: 10.3390/s24165370.
6
Application of Deep Reinforcement Learning Algorithm in Uncertain Logistics Transportation Scheduling.
Comput Intell Neurosci. 2021 Sep 25;2021:5672227. doi: 10.1155/2021/5672227. eCollection 2021.
7
Dynamic sub-route-based self-adaptive beam search Q-learning algorithm for traveling salesman problem.
PLoS One. 2023 Mar 21;18(3):e0283207. doi: 10.1371/journal.pone.0283207. eCollection 2023.
8
Deep Reinforcement Learning with Local Attention for Single Agile Optical Satellite Scheduling Problem.
Sensors (Basel). 2024 Oct 2;24(19):6396. doi: 10.3390/s24196396.
9
Research on UCAV Maneuvering Decision Method Based on Heuristic Reinforcement Learning.
Comput Intell Neurosci. 2022 Mar 3;2022:1477078. doi: 10.1155/2022/1477078. eCollection 2022.
10
Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm.
PLoS One. 2021 Jun 10;16(6):e0252754. doi: 10.1371/journal.pone.0252754. eCollection 2021.

Cited by

1
Impact of minimum distance constraints on sheet metal waste for plasma cutting.
PLoS One. 2023 Sep 27;18(9):e0292032. doi: 10.1371/journal.pone.0292032. eCollection 2023.
2
The machining torch movement for the rectangular plasma sheet metal cut.
PLoS One. 2023 Sep 14;18(9):e0291184. doi: 10.1371/journal.pone.0291184. eCollection 2023.
