Integrating Heuristic Methods with Deep Reinforcement Learning for Online 3D Bin-Packing Optimization.

Authors

Wong Ching-Chang, Tsai Tai-Ting, Ou Can-Kun

Affiliation

Department of Electrical and Computer Engineering, Tamkang University, New Taipei City 25137, Taiwan.

Publication

Sensors (Basel). 2024 Aug 20;24(16):5370. doi: 10.3390/s24165370.

DOI: 10.3390/s24165370
PMID: 39205064
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11358981/
Abstract

This study proposes Hybrid Heuristic Proximal Policy Optimization (HHPPO), a method for online 3D bin packing that integrates bin-packing heuristics with the Proximal Policy Optimization (PPO) algorithm of deep reinforcement learning. On the heuristic side, an extreme point priority sorting method orders the generated extreme points by their waste space to improve space utilization. The space status of the container is represented as a 3D grid, and partial support constraints are proposed to allow more ways of stacking objects and to raise overall space utilization. On the PPO side, heuristic algorithms are integrated into the algorithm, and the reward function and the action space of the policy network are designed so that the method can complete the online 3D bin-packing task effectively. Experimental results in simulation environments show that the method performs well on online 3D bin-packing tasks. In addition, an environment with image-based vision is constructed to show that the method enables a real robot manipulator to complete the bin-packing task successfully and effectively in a real environment.
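
The heuristic ideas named in the abstract (a grid representation of the container, partial support constraints, and sorting candidate points by waste space) can be illustrated with a small sketch. The abstract does not define any of these precisely, so everything below is an assumption made for illustration: the 3D grid is simplified to a 2D heightmap, "waste space" is taken to be the number of empty cells sealed under an item, partial support is a hypothetical `min_support` threshold on the fraction of the footprint resting at the contact height, and the candidate points are passed in rather than generated. None of the function names come from the paper.

```python
import numpy as np


def support_ratio(heightmap, x, y, w, d):
    """Return (resting height, fraction of the footprint supported at that height)."""
    region = heightmap[x:x + w, y:y + d]
    z = int(region.max())
    return z, float((region == z).mean())


def can_place(heightmap, bin_height, item, x, y, min_support=0.8):
    """Check bounds, height limit, and the assumed partial support threshold."""
    w, d, h = item
    W, D = heightmap.shape
    if x < 0 or y < 0 or x + w > W or y + d > D:
        return False
    z, ratio = support_ratio(heightmap, x, y, w, d)
    return bool(z + h <= bin_height and ratio >= min_support)


def waste_below(heightmap, item, x, y):
    """Assumed waste measure: empty cells that would be sealed under the item."""
    w, d, _ = item
    region = heightmap[x:x + w, y:y + d]
    return int((region.max() - region).sum())


def place(heightmap, item, x, y):
    """Update the heightmap in place after putting the item at (x, y)."""
    w, d, h = item
    region = heightmap[x:x + w, y:y + d]
    region[:] = region.max() + h


def sorted_candidates(heightmap, bin_height, item, points, min_support=0.8):
    """Keep feasible points and order them by waste, then resting height."""
    feasible = [
        p for p in points
        if can_place(heightmap, bin_height, item, p[0], p[1], min_support)
    ]
    return sorted(
        feasible,
        key=lambda p: (
            waste_below(heightmap, item, p[0], p[1]),
            support_ratio(heightmap, p[0], p[1], item[0], item[1])[0],
            p,
        ),
    )


# Usage: a 4 x 4 container of height 4 with one 2 x 2 x 2 box in the corner.
heightmap = np.zeros((4, 4), dtype=int)
place(heightmap, (2, 2, 2), 0, 0)
candidates = [(0, 0), (2, 0), (0, 2), (1, 1)]
print(sorted_candidates(heightmap, 4, (2, 2, 1), candidates))
```

In this sketch the point (1, 1) is rejected because only a quarter of the footprint would be supported. In a learning setup along the lines the abstract describes, such a sorted, filtered list could serve as a reduced action set for the policy network, though the abstract does not say how HHPPO actually defines its action space.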


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/045253196463/sensors-24-05370-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/2e240eef7a42/sensors-24-05370-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/bfd8b3052ff9/sensors-24-05370-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/a666d83a7b78/sensors-24-05370-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/5b33e3bd3f87/sensors-24-05370-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/dc1f6009d8af/sensors-24-05370-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/4ac3d927bc2e/sensors-24-05370-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/409e4a0727b9/sensors-24-05370-g008a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/d6f91ac7c8ba/sensors-24-05370-g009a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/b5dec3aae66a/sensors-24-05370-g010a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/e32d13d1c53f/sensors-24-05370-g011a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7fa/11358981/f02fd5e5de36/sensors-24-05370-g012.jpg

Similar articles

1. Integrating Heuristic Methods with Deep Reinforcement Learning for Online 3D Bin-Packing Optimization.
   Sensors (Basel). 2024 Aug 20;24(16):5370. doi: 10.3390/s24165370.
2. A deep reinforcement learning algorithm for the rectangular strip packing problem.
   PLoS One. 2023 Mar 16;18(3):e0282598. doi: 10.1371/journal.pone.0282598. eCollection 2023.
3. Optimizing e-commerce warehousing through open dimension management in a three-dimensional bin packing system.
   PeerJ Comput Sci. 2023 Oct 9;9:e1613. doi: 10.7717/peerj-cs.1613. eCollection 2023.
4. A GAN-based genetic algorithm for solving the 3D bin packing problem.
   Sci Rep. 2024 Apr 2;14(1):7775. doi: 10.1038/s41598-024-56699-7.
5. Smart Pack: Online Autonomous Object-Packing System Using RGB-D Sensor Data.
   Sensors (Basel). 2020 Aug 9;20(16):4448. doi: 10.3390/s20164448.
6. BoxStacker: Deep Reinforcement Learning for 3D Bin Packing Problem in Virtual Environment of Logistics Systems.
   Sensors (Basel). 2023 Aug 3;23(15):6928. doi: 10.3390/s23156928.
7. QAL-BP: an augmented Lagrangian quantum approach for bin packing.
   Sci Rep. 2024 Mar 1;14(1):5142. doi: 10.1038/s41598-023-50540-3.
8. Automating the packing heuristic design process with genetic programming.
   Evol Comput. 2012 Spring;20(1):63-89. doi: 10.1162/EVCO_a_00044. Epub 2011 Dec 2.
9. Research on UCAV Maneuvering Decision Method Based on Heuristic Reinforcement Learning.
   Comput Intell Neurosci. 2022 Mar 3;2022:1477078. doi: 10.1155/2022/1477078. eCollection 2022.
10. Adaptive Discount Factor for Deep Reinforcement Learning in Continuing Tasks with Uncertainty.
    Sensors (Basel). 2022 Sep 25;22(19):7266. doi: 10.3390/s22197266.
