Wong Ching-Chang, Yeh Li-Yu, Liu Chih-Cheng, Tsai Chi-Yi, Aoyama Hisayuki
Department of Electrical and Computer Engineering, Tamkang University, New Taipei City 25137, Taiwan.
Department of Mechanical and Intelligent Systems Engineering, University of Electro-Communications, Tokyo 182-8585, Japan.
Sensors (Basel). 2021 Mar 24;21(7):2280. doi: 10.3390/s21072280.
In this paper, a manipulation planning method for object re-orientation based on semantic segmentation keypoint detection is proposed for a robot manipulator, enabling it to detect randomly placed objects and re-orient them to a specified position and pose. The method consists of two main parts: (1) a 3D keypoint detection system and (2) a manipulation planning system for object re-orientation. In the 3D keypoint detection system, an RGB-D camera is used to obtain information about the environment and to generate 3D keypoints of the target object as inputs that represent its position and pose. This process simplifies the 3D model representation, so that manipulation planning for object re-orientation can be executed in a category-level manner by adding varied training data for the object class in the training phase. In addition, 3D suction points in both the object's current and expected poses are generated as inputs to the next operation stage. In this stage, the Mask Region-Convolutional Neural Network (Mask R-CNN) algorithm is used for preliminary object detection and object image extraction. The image with the highest confidence index is selected as the input to the semantic segmentation system, which classifies each pixel in the image into the corresponding pack unit of the object. After the convolutional neural network performs semantic segmentation, the Conditional Random Fields (CRFs) method is applied for several iterations to obtain a more accurate object recognition result. Once the target object is segmented into pack units, the center position of each pack unit can be obtained. Then, a normal vector at each pack unit's center point is generated from the depth image information, and the pose of the object is obtained by connecting the center points of the pack units. In the manipulation planning system for object re-orientation, the pose of the object and the normal vector of each pack unit are first converted into the working coordinate system of the robot manipulator. Then, according to the current and expected poses of the object, the spherical linear interpolation (Slerp) algorithm is used to generate a series of workspace movements for the robot manipulator to re-orient the object. In addition, the pose of the object is adjusted about the z-axis of the object's geodetic coordinate system based on image features on the object's surface, so that the pose of the placed object approaches the desired pose. Finally, a robot manipulator and a laboratory-made vacuum suction cup are used to verify that the proposed system can complete the planned object re-orientation task.
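The re-orientation step described above blends the object's current and expected orientations with Slerp to produce a sequence of intermediate waypoints for the manipulator. The following is a minimal sketch of that idea, not the authors' implementation: the function name, the use of SciPy's Rotation/Slerp utilities, the straight-line position blending, and the placeholder pose values are all assumptions made for illustration.

```python
# Minimal sketch (assumed, not the authors' code) of Slerp-based pose
# interpolation between an object's current and expected poses, expressed
# in the manipulator's working coordinate system.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp


def interpolate_reorientation(p_start, q_start, p_goal, q_goal, n_steps=10):
    """Return n_steps (position, quaternion) waypoints from the current pose
    to the expected pose: orientation via spherical linear interpolation,
    position via straight-line interpolation."""
    key_rots = Rotation.from_quat([q_start, q_goal])  # quaternions in (x, y, z, w) order
    slerp = Slerp([0.0, 1.0], key_rots)

    waypoints = []
    for t in np.linspace(0.0, 1.0, n_steps):
        position = (1.0 - t) * np.asarray(p_start) + t * np.asarray(p_goal)
        orientation = slerp([t]).as_quat()[0]
        waypoints.append((position, orientation))
    return waypoints


# Placeholder example poses (meters; quaternion in x, y, z, w order).
current_pose = ([0.40, 0.10, 0.05], [0.0, 0.0, 0.0, 1.0])
expected_pose = ([0.40, -0.20, 0.05],
                 Rotation.from_euler("z", 90, degrees=True).as_quat())
path = interpolate_reorientation(*current_pose, *expected_pose, n_steps=8)
```

Slerp keeps the angular change between consecutive waypoints roughly uniform, which is why it is generally preferred over interpolating Euler angles or raw quaternion components for this kind of re-orientation motion.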