Wong Ching-Chang, Yeh Li-Yu, Liu Chih-Cheng, Tsai Chi-Yi, Aoyama Hisayuki
Department of Electrical and Computer Engineering, Tamkang University, New Taipei City 25137, Taiwan.
Department of Mechanical and Intelligent Systems Engineering, University of Electro-Communications, Tokyo 182-8585, Japan.
Sensors (Basel). 2021 Mar 24;21(7):2280. doi: 10.3390/s21072280.
In this paper, a manipulation planning method for object re-orientation based on semantic segmentation keypoint detection is proposed for a robot manipulator, enabling it to detect randomly placed objects and re-orient them to a specified position and pose. The method consists of two main parts: (1) a 3D keypoint detection system and (2) a manipulation planning system for object re-orientation. In the 3D keypoint detection system, an RGB-D camera is used to obtain information about the environment and to generate 3D keypoints of the target object as inputs that represent its position and pose. This process simplifies the 3D model representation, so that manipulation planning for object re-orientation can be executed in a category-level manner by adding varied training data for the object class in the training phase. In addition, 3D suction points in both the object's current and expected poses are generated as inputs to the next operation stage. In this stage, the Mask Region-Convolutional Neural Network (Mask R-CNN) algorithm is used for preliminary object detection and object image extraction. The image with the highest confidence index is selected as the input to the semantic segmentation system, which classifies each pixel in the image into the corresponding pack unit of the object. After the convolutional neural network performs semantic segmentation, the Conditional Random Fields (CRFs) method is applied for several iterations to obtain a more accurate object recognition result. Once the target object is segmented into pack units, the center position of each pack unit can be obtained. Then, a normal vector at each pack unit's center point is generated from the depth image information, and the pose of the object is obtained by connecting the center points of the pack units. In the manipulation planning system for object re-orientation, the pose of the object and the normal vector of each pack unit are first converted into the working coordinate system of the robot manipulator. Then, according to the current and expected poses of the object, the spherical linear interpolation (Slerp) algorithm is used to generate a series of workspace movements for the robot manipulator to re-orient the object. In addition, the pose of the object is adjusted about the z-axis of the object's geodetic coordinate system based on image features on the object's surface, so that the pose of the placed object approaches the desired pose. Finally, a robot manipulator and a laboratory-made vacuum suction cup are used to verify that the proposed system can complete the planned object re-orientation task.
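The re-orientation step described above blends the object's current and expected orientations with Slerp to produce a sequence of intermediate waypoints for the manipulator. The following is a minimal sketch of that idea, not the authors' implementation: the function name, the use of SciPy's Rotation/Slerp utilities, the straight-line position blending, and the placeholder pose values are all assumptions made for illustration.

```python
# Minimal sketch (assumed, not the authors' code) of Slerp-based pose
# interpolation between an object's current and expected poses, expressed
# in the manipulator's working coordinate system.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp


def interpolate_reorientation(p_start, q_start, p_goal, q_goal, n_steps=10):
    """Return n_steps (position, quaternion) waypoints from the current pose
    to the expected pose: orientation via spherical linear interpolation,
    position via straight-line interpolation."""
    key_rots = Rotation.from_quat([q_start, q_goal])  # quaternions in (x, y, z, w) order
    slerp = Slerp([0.0, 1.0], key_rots)

    waypoints = []
    for t in np.linspace(0.0, 1.0, n_steps):
        position = (1.0 - t) * np.asarray(p_start) + t * np.asarray(p_goal)
        orientation = slerp([t]).as_quat()[0]
        waypoints.append((position, orientation))
    return waypoints


# Placeholder example poses (meters; quaternion in x, y, z, w order).
current_pose = ([0.40, 0.10, 0.05], [0.0, 0.0, 0.0, 1.0])
expected_pose = ([0.40, -0.20, 0.05],
                 Rotation.from_euler("z", 90, degrees=True).as_quat())
path = interpolate_reorientation(*current_pose, *expected_pose, n_steps=8)
```

Slerp keeps the angular change between consecutive waypoints roughly uniform, which is why it is generally preferred over interpolating Euler angles or raw quaternion components for this kind of re-orientation motion.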