
GR-ConvNet v2: A Real-Time Multi-Grasp Detection Network for Robotic Grasping.

Affiliations

The Department of Electrical Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA.

eBots Inc., Fremont, CA 94539, USA.

Publication Info

Sensors (Basel). 2022 Aug 18;22(16):6208. doi: 10.3390/s22166208.

Abstract

We propose a dual-module robotic system to tackle the problem of generating and performing antipodal robotic grasps for unknown objects from an n-channel image of the scene. We present an improved version of the Generative Residual Convolutional Neural Network (GR-ConvNet v2) that can generate robust antipodal grasps from n-channel image input at real-time speeds (20 ms). We evaluated the proposed architecture on three standard datasets and achieved new state-of-the-art accuracies of 98.8%, 95.1%, and 97.4% on the Cornell, Jacquard, and Graspnet grasping datasets, respectively. Empirical results show that our model significantly outperforms prior work under a stricter IoU-based grasp detection metric. We conducted a suite of tests in simulation and the real world on a diverse set of previously unseen objects, including items with adversarial geometry and household objects. We demonstrate the adaptability of our approach by directly transferring the trained model to a 7-DoF robotic manipulator, achieving grasp success rates of 95.4% and 93.0% on novel household and adversarial objects, respectively. Furthermore, we validate the generalization capability of our pixel-wise grasp prediction model on the complex Ravens-10 benchmark tasks, some of which require closed-loop visual feedback for multi-step sequencing.
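To make the pixel-wise prediction concrete: GR-ConvNet-style models map an n-channel input image to per-pixel grasp quality, orientation (encoded as cos 2θ / sin 2θ to handle the symmetry of antipodal grasps), and gripper-width maps, and a grasp is read off at the quality argmax. The sketch below shows that decoding step under those assumptions; the model class and tensor shapes are illustrative placeholders, not the authors' released API.

```python
# Minimal sketch of decoding a pixel-wise grasp prediction into a single
# antipodal grasp. Assumes a GR-ConvNet-style model whose forward pass
# returns four H x W maps: quality, cos(2*theta), sin(2*theta), width.
import numpy as np
import torch

def decode_best_grasp(quality, cos2t, sin2t, width):
    """Pick the pixel with the highest grasp quality and recover
    (row, col, angle, width) from the angle/width maps at that pixel."""
    q = quality.squeeze().cpu().numpy()
    row, col = np.unravel_index(np.argmax(q), q.shape)
    # Angle is encoded as (cos 2θ, sin 2θ) because θ and θ+π describe the
    # same antipodal grasp; halving atan2 recovers θ in (-π/2, π/2].
    theta = 0.5 * np.arctan2(
        sin2t.squeeze()[row, col].item(),
        cos2t.squeeze()[row, col].item(),
    )
    w = width.squeeze()[row, col].item()
    return row, col, theta, w

# Usage (hypothetical model object; a real run would load trained weights):
# model = GRConvNetV2(input_channels=4)            # e.g. RGB-D input
# quality, cos2t, sin2t, width = model(torch.randn(1, 4, 224, 224))
# grasp = decode_best_grasp(quality, cos2t, sin2t, width)
```

The cos/sin encoding is what lets a convolutional regressor predict orientation smoothly; regressing θ directly would introduce a discontinuity at the ±π/2 wrap-around.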

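The "stricter IoU-based grasp detection metric" refers to the rectangle (Jaccard) metric commonly used on Cornell and Jacquard: a predicted grasp rectangle counts as correct if its orientation is within 30° of a ground-truth rectangle and their IoU exceeds a threshold (25% in most prior work; a stricter evaluation raises this threshold). Below is a hedged sketch of that check using shapely for rotated-rectangle overlap; the grasp parameterization and threshold values are assumptions, not the paper's exact settings.

```python
# Sketch of the rectangle (Jaccard) grasp metric, assuming grasps are given
# as (center_x, center_y, angle_rad, width, height) rotated rectangles.
import math
from shapely.geometry import Polygon

def grasp_rect(cx, cy, angle, w, h):
    """Corner points of a rotated grasp rectangle as a shapely Polygon."""
    dx, dy = w / 2.0, h / 2.0
    c, s = math.cos(angle), math.sin(angle)
    corners = [(-dx, -dy), (dx, -dy), (dx, dy), (-dx, dy)]
    return Polygon([(cx + c * x - s * y, cy + s * x + c * y) for x, y in corners])

def grasp_correct(pred, gt, iou_thresh=0.25, angle_thresh=math.radians(30)):
    """Rectangle metric: angle within the threshold (mod pi, since antipodal
    grasps are symmetric) and rectangle IoU above the threshold."""
    d_angle = abs(pred[2] - gt[2]) % math.pi
    d_angle = min(d_angle, math.pi - d_angle)
    if d_angle > angle_thresh:
        return False
    p, g = grasp_rect(*pred), grasp_rect(*gt)
    inter = p.intersection(g).area
    union = p.union(g).area
    return union > 0 and inter / union > iou_thresh
```

Raising iou_thresh above the customary 0.25 makes the benchmark harder, which is the sense in which the paper's evaluation is "stricter."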

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e96f/9415764/f0739f6df137/sensors-22-06208-g001.jpg
