Integrating Heuristic Methods with Deep Reinforcement Learning for Online 3D Bin-Packing Optimization

Authors

Wong Ching-Chang, Tsai Tai-Ting, Ou Can-Kun

Affiliation

Department of Electrical and Computer Engineering, Tamkang University, New Taipei City 25137, Taiwan.

Publication information

Sensors (Basel). 2024 Aug 20;24(16):5370. doi: 10.3390/s24165370.

DOI:10.3390/s24165370

PMID:39205064

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11358981/

Abstract

This study proposes Hybrid Heuristic Proximal Policy Optimization (HHPPO), a method for online 3D bin-packing that integrates bin-packing heuristics with the Proximal Policy Optimization (PPO) algorithm from deep reinforcement learning. On the heuristic side, an extreme point priority sorting method orders the generated extreme points by their wasted space to improve space utilization. In addition, the space status of the container is represented as a 3D grid, and partial support constraints are proposed to increase the possibilities for stacking objects and enhance overall space utilization. On the PPO side, heuristic algorithms are integrated, and the reward function and the action space of the policy network are designed so that the method can effectively complete the online 3D bin-packing task. Experimental results show that the proposed method performs well on online 3D bin-packing tasks in simulation environments. Finally, an environment with image vision is constructed to show that the method enables an actual robot manipulator to complete the bin-packing task successfully and effectively in a real environment.
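
As a rough illustration of how a 3D grid state, a partial support check, and waste-based ordering of candidate points might fit together, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not define the support rule or the waste measure, so the 80% support threshold, the trapped-volume definition of waste, and all function names (`column_heights`, `can_place`, `wasted_space`, `ranked_candidates`, `place`) are assumptions made for illustration. Candidate points are passed in by hand rather than generated as extreme points, and the PPO policy is left out entirely.

```python
import numpy as np


def column_heights(grid):
    """Top surface height of every (x, y) column of a boolean (W, D, H) grid."""
    levels = np.arange(1, grid.shape[2] + 1)
    return (grid * levels).max(axis=2)


def can_place(grid, x, y, box, min_support=0.8):
    """Assumed partial support rule: the box rests on the highest surface under
    its footprint, and at least `min_support` of the footprint must touch it."""
    w, d, h = box
    W, D, H = grid.shape
    if x < 0 or y < 0 or x + w > W or y + d > D:
        return False
    footprint = column_heights(grid)[x:x + w, y:y + d]
    rest = footprint.max()
    supported = (footprint == rest).mean()
    return bool(rest + h <= H and supported >= min_support)


def wasted_space(grid, x, y, box):
    """Assumed waste measure: empty cells trapped underneath the box."""
    w, d, _ = box
    footprint = column_heights(grid)[x:x + w, y:y + d]
    return int((footprint.max() - footprint).sum())


def ranked_candidates(grid, points, box, min_support=0.8):
    """Keep feasible points and sort them by (waste, resting height, position)."""
    w, d, _ = box
    heights = column_heights(grid)

    def key(p):
        rest = int(heights[p[0]:p[0] + w, p[1]:p[1] + d].max())
        return (wasted_space(grid, p[0], p[1], box), rest, p)

    feasible = [p for p in points if can_place(grid, p[0], p[1], box, min_support)]
    return sorted(feasible, key=key)


def place(grid, x, y, box):
    """Mark the cells occupied by the box once it has dropped onto its support."""
    w, d, h = box
    rest = int(column_heights(grid)[x:x + w, y:y + d].max())
    grid[x:x + w, y:y + d, rest:rest + h] = True


# Usage: a 4 x 4 x 4 container with one 2 x 2 x 2 box already in the corner.
container = np.zeros((4, 4, 4), dtype=bool)
place(container, 0, 0, (2, 2, 2))
candidates = [(2, 0), (0, 2), (1, 0), (0, 0)]
print(ranked_candidates(container, candidates, (2, 2, 1)))
```

In this toy setup the point `(1, 0)` is rejected under the default threshold because only half of the footprint would be supported. Lowering `min_support` to 0.5 admits it, but it sorts last because it would trap four empty cells. In a hybrid scheme of the kind the abstract describes, a learned policy could choose among such ranked, feasible candidates instead of always taking the first one; how HHPPO actually does this is not specified in the abstract.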

