

DON6D: a decoupled one-stage network for 6D pose estimation.

Authors

Wang Zheng, Tu Hangyao, Qian Yutong, Zhao Yanwei

Affiliations

School of Computer and Computational Sciences, Hangzhou City University, Hangzhou, 310015, China.

School of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China.

Publication

Sci Rep. 2024 Apr 10;14(1):8410. doi: 10.1038/s41598-024-59152-x.

DOI: 10.1038/s41598-024-59152-x
PMID: 38600244
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11385229/
Abstract

The six-dimensional (6D) pose object estimation is a key task in robotic manipulation and grasping scenes. Many existing two-stage solutions with a slow inference speed require extra refinement to handle the challenges of variations in lighting, sensor noise, object occlusion, and truncation. To address these challenges, this work proposes a decoupled one-stage network (DON6D) model for 6D pose estimation that improves inference speed on the premise of maintaining accuracy. Particularly, since the RGB images are aligned with the RGB-D images, the proposed DON6D first uses a two-dimensional detection network to locate the interested objects in RGB-D images. Then, a module of feature extraction and fusion is used to extract color and geometric features fully. Further, dual data augmentation is performed to enhance the generalization ability of the proposed model. Finally, the features are fused, and an attention residual encoder-decoder, which can improve the pose estimation performance to obtain an accurate 6D pose, is introduced. The proposed DON6D model is evaluated on the LINEMOD and YCB-Video datasets. The results demonstrate that the proposed DON6D is superior to several state-of-the-art methods regarding the ADD(-S) and ADD(-S) AUC metrics.
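The ADD(-S) metrics the abstract reports on are the standard 6D-pose evaluation measures: ADD averages the distance between corresponding model points under the predicted and ground-truth poses, while ADD-S (for symmetric objects) matches each predicted point to its nearest ground-truth point. A minimal NumPy sketch of both (the toy model points below are illustrative, not from the paper):

```python
import numpy as np

def transform(points, R, t):
    # Apply a rigid transform (3x3 rotation R, 3-vector translation t)
    # to an Nx3 array of model points.
    return points @ R.T + t

def add_metric(points, R_pred, t_pred, R_gt, t_gt):
    # ADD: mean distance between *corresponding* model points under the
    # predicted and ground-truth poses (non-symmetric objects).
    p_pred = transform(points, R_pred, t_pred)
    p_gt = transform(points, R_gt, t_gt)
    return np.linalg.norm(p_pred - p_gt, axis=1).mean()

def adds_metric(points, R_pred, t_pred, R_gt, t_gt):
    # ADD-S: for symmetric objects, each predicted point is matched to its
    # *closest* ground-truth point before averaging.
    p_pred = transform(points, R_pred, t_pred)
    p_gt = transform(points, R_gt, t_gt)
    dists = np.linalg.norm(p_pred[:, None, :] - p_gt[None, :, :], axis=2)
    return dists.min(axis=1).mean()

# Identity poses on a toy 3-point model: both metrics are zero.
model = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0]])
I, zero = np.eye(3), np.zeros(3)
print(add_metric(model, I, zero, I, zero))   # 0.0
print(adds_metric(model, I, zero, I, zero))  # 0.0
```

In benchmark use, a pose is typically counted correct when ADD(-S) falls below a threshold (commonly 10% of the object diameter), and ADD(-S) AUC integrates accuracy over a range of thresholds.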

Figures (from the PMC full text):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd27/11385229/b02f071b77df/41598_2024_59152_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd27/11385229/1f7f98369e3b/41598_2024_59152_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd27/11385229/94f3fddcbccf/41598_2024_59152_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd27/11385229/a24aa66f4432/41598_2024_59152_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd27/11385229/8b4fd13786cb/41598_2024_59152_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd27/11385229/2a8824a3a1a1/41598_2024_59152_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd27/11385229/9fa8aedd84ca/41598_2024_59152_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd27/11385229/dfd091b81227/41598_2024_59152_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd27/11385229/2b3e67a55c81/41598_2024_59152_Fig9_HTML.jpg

Similar Articles

1
DON6D: a decoupled one-stage network for 6D pose estimation.
Sci Rep. 2024 Apr 10;14(1):8410. doi: 10.1038/s41598-024-59152-x.
2
6D Object Pose Estimation Based on Cross-Modality Feature Fusion.
Sensors (Basel). 2023 Sep 26;23(19):8088. doi: 10.3390/s23198088.
3
A Manufacturing-Oriented Intelligent Vision System Based on Deep Neural Network for Object Recognition and 6D Pose Estimation.
Front Neurorobot. 2021 Jan 7;14:616775. doi: 10.3389/fnbot.2020.616775. eCollection 2020.
4
DOPE++: 6D pose estimation algorithm for weakly textured objects based on deep neural networks.
PLoS One. 2022 Jun 8;17(6):e0269175. doi: 10.1371/journal.pone.0269175. eCollection 2022.
5
Instance-level 6D pose estimation based on multi-task parameter sharing for robotic grasping.
Sci Rep. 2024 Apr 2;14(1):7801. doi: 10.1038/s41598-024-58590-x.
6
Multi-level feature fusion and joint refinement for simultaneous object pose estimation and camera localization.
Neural Netw. 2024 Jun;174:106238. doi: 10.1016/j.neunet.2024.106238. Epub 2024 Mar 16.
7
6IMPOSE: bridging the reality gap in 6D pose estimation for robotic grasping.
Front Robot AI. 2023 Sep 27;10:1176492. doi: 10.3389/frobt.2023.1176492. eCollection 2023.
8
6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-Based Instance Representation Learning.
IEEE Trans Image Process. 2022;31:6907-6921. doi: 10.1109/TIP.2022.3216980. Epub 2022 Nov 3.
9
6-D Object Pose Estimation Based on Point Pair Matching for Robotic Grasp Detection.
IEEE Trans Neural Netw Learn Syst. 2025 Jul;36(7):11902-11916. doi: 10.1109/TNNLS.2024.3442433.
10
Iterative Pose Refinement for Object Pose Estimation Based on RGBD Data.
Sensors (Basel). 2020 Jul 24;20(15):4114. doi: 10.3390/s20154114.
