Learning to Solve 3-D Bin Packing Problem via Deep Reinforcement Learning and Constraint Programming.

Authors

Jiang Yuan, Cao Zhiguang, Zhang Jie

Publication information

IEEE Trans Cybern. 2023 May;53(5):2864-2875. doi: 10.1109/TCYB.2021.3121542. Epub 2023 Apr 21.

DOI: 10.1109/TCYB.2021.3121542
PMID: 34748508

Abstract

Recently, there has been growing interest in applying deep reinforcement learning (DRL) to the 3-D bin packing problem (3-D BPP). However, because their encoders are relatively uninformative yet computationally heavy, and because the action space inherent to the 3-D BPP is considerably large, existing DRL methods can handle only up to 50 boxes. In this article, we propose to alleviate this issue with a DRL agent that sequentially addresses three subtasks: sequence, orientation, and position. Specifically, we exploit a multimodal encoder in which a sparse attention subencoder embeds the box state, reducing computation while the packing policy is learned, and a convolutional neural network subencoder embeds the view state to produce an auxiliary spatial representation. We also leverage action representation learning in the decoder to cope with the large action space of the position subtask. In addition, we integrate the proposed DRL agent into constraint programming (CP) to iteratively improve solution quality by exploiting the powerful search framework in CP. The experiments show that both the standalone DRL method and the hybrid method enable the agent to solve large-scale instances of 120 boxes or more. Moreover, both deliver superior performance over the baselines on instances of various scales.
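
The decomposition into sequence, orientation, and position subtasks can be illustrated with a minimal sketch. This is not the authors' method: each learned decision is replaced here by a hand-written stand-in rule, and the function names, the integer grid, the heightmap used as a top-down "view state", and the stack height reported as a quality measure are all assumptions made for illustration only.

```python
from itertools import permutations


def choose_box(remaining):
    """Sequence subtask (stand-in rule): take the largest-volume box next."""
    return max(remaining, key=lambda b: b[0] * b[1] * b[2])


def choose_orientation(box, bin_l, bin_w):
    """Orientation subtask (stand-in rule): the flattest rotation that fits."""
    fits = [o for o in set(permutations(box)) if o[0] <= bin_l and o[1] <= bin_w]
    if not fits:
        raise ValueError(f"box {box} does not fit the bin footprint")
    return min(fits, key=lambda o: (o[2], o))


def choose_position(heightmap, l, w):
    """Position subtask (stand-in rule): the footprint with the lowest resting height."""
    best = None
    for x in range(len(heightmap) - l + 1):
        for y in range(len(heightmap[0]) - w + 1):
            # A box rests on the highest cell under its footprint.
            z = max(heightmap[i][j] for i in range(x, x + l) for j in range(y, y + w))
            if best is None or z < best[2]:
                best = (x, y, z)
    return best


def pack(boxes, bin_l, bin_w):
    """Pack (l, w, h) boxes one at a time: sequence, then orientation, then position."""
    heightmap = [[0] * bin_w for _ in range(bin_l)]  # hypothetical top-down view state
    remaining = list(boxes)
    placements = []
    while remaining:
        box = choose_box(remaining)
        remaining.remove(box)
        l, w, h = choose_orientation(box, bin_l, bin_w)
        x, y, z = choose_position(heightmap, l, w)
        # Raise the heightmap under the placed box to its new top surface.
        for i in range(x, x + l):
            for j in range(y, y + w):
                heightmap[i][j] = z + h
        placements.append(((l, w, h), (x, y, z)))
    return placements, max(max(row) for row in heightmap)


placements, height = pack([(4, 4, 1), (2, 2, 2)], 4, 4)
print(height)  # 3
```

The position rule scans every cell of the bin footprint, so its choice set grows with the bin's length times its width. That gives a rough sense of why the abstract singles out the position subtask as the one with a large action space.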

Similar articles

1
Learning to Solve 3-D Bin Packing Problem via Deep Reinforcement Learning and Constraint Programming.
IEEE Trans Cybern. 2023 May;53(5):2864-2875. doi: 10.1109/TCYB.2021.3121542. Epub 2023 Apr 21.
2
Deep Reinforcement Learning for Solving Vehicle Routing Problems With Backhauls.
IEEE Trans Neural Netw Learn Syst. 2025 Mar;36(3):4779-4793. doi: 10.1109/TNNLS.2024.3371781. Epub 2025 Feb 28.
3
Deep Reinforcement Learning for Solving the Heterogeneous Capacitated Vehicle Routing Problem.
IEEE Trans Cybern. 2022 Dec;52(12):13572-13585. doi: 10.1109/TCYB.2021.3111082. Epub 2022 Nov 18.
4
A deep reinforcement learning algorithm for the rectangular strip packing problem.
PLoS One. 2023 Mar 16;18(3):e0282598. doi: 10.1371/journal.pone.0282598. eCollection 2023.
5
Sequence Decision Transformer for Adaptive Traffic Signal Control.
Sensors (Basel). 2024 Sep 25;24(19):6202. doi: 10.3390/s24196202.
6
Deep Reinforcement Learning: A Survey.
IEEE Trans Neural Netw Learn Syst. 2024 Apr;35(4):5064-5078. doi: 10.1109/TNNLS.2022.3207346. Epub 2024 Apr 4.
7
Double Sparse Deep Reinforcement Learning via Multilayer Sparse Coding and Nonconvex Regularized Pruning.
IEEE Trans Cybern. 2023 Feb;53(2):765-778. doi: 10.1109/TCYB.2022.3157892. Epub 2023 Jan 13.
8
Space-Air-Ground Integrated Mobile Crowdsensing for Partially Observable Data Collection by Multi-Scale Convolutional Graph Reinforcement Learning.
Entropy (Basel). 2022 May 1;24(5):638. doi: 10.3390/e24050638.
9
DVNE-DRL: dynamic virtual network embedding algorithm based on deep reinforcement learning.
Sci Rep. 2023 Nov 13;13(1):19789. doi: 10.1038/s41598-023-47195-5.
10
Integrating Heuristic Methods with Deep Reinforcement Learning for Online 3D Bin-Packing Optimization.
Sensors (Basel). 2024 Aug 20;24(16):5370. doi: 10.3390/s24165370.

Cited by

1
A deep reinforcement learning algorithm for the rectangular strip packing problem.
PLoS One. 2023 Mar 16;18(3):e0282598. doi: 10.1371/journal.pone.0282598. eCollection 2023.