
Generative Model With Coordinate Metric Learning for Object Recognition Based on 3D Models.

Publication information

IEEE Trans Image Process. 2018 Dec;27(12):5813-5826. doi: 10.1109/TIP.2018.2858553. Epub 2018 Jul 23.

DOI: 10.1109/TIP.2018.2858553
PMID: 30040643
Abstract

One of the bottlenecks in acquiring a perfect database for deep learning is the tedious process of collecting and labeling data. In this paper, we propose a generative model trained with synthetic images rendered from 3D models, which can reduce the burden of collecting real training data and make the background conditions more realistic. Our architecture is composed of two sub-networks: a semantic foreground object reconstruction network based on Bayesian inference, and a classification network based on multi-triplet cost training. The latter avoids overfitting on the monotone synthetic object surface and utilizes accurate information from synthetic images, such as object poses and lighting conditions, which is helpful for recognizing regular photos. First, our generative model with metric learning utilizes additional foreground object channels generated by the semantic foreground object reconstruction sub-network for recognizing the original input images. A multi-triplet cost function based on poses is used for metric learning, which makes it possible to train an effective categorical classifier purely on synthetic data. Second, we design a coordinate training strategy, with the help of adaptive noise applied to the inputs of both concatenated sub-networks, so that they benefit from each other and avoid inharmonious parameter tuning caused by the different convergence speeds of the two sub-networks. Our architecture achieves state-of-the-art accuracy of 50.5% on the ShapeNet database with the data migration obstacle from synthetic images to real images. This pipeline makes it possible to perform recognition on real images based only on 3D models. Our code is available at https://github.com/wangyida/gm-cml.
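
The abstract mentions a multi-triplet cost function for metric learning but does not give its exact form. As a rough illustration only, the sketch below shows a generic hinge-style triplet loss summed over several negatives. The function names, the squared Euclidean distance, the margin value, and the way negatives are chosen are all assumptions made here for illustration; they are not taken from the paper or from the authors' repository.

```python
from typing import Sequence

Vector = Sequence[float]


def squared_distance(u: Vector, v: Vector) -> float:
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def multi_triplet_loss(
    anchor: Vector,
    positive: Vector,
    negatives: Sequence[Vector],
    margin: float = 1.0,  # assumed value, not from the paper
) -> float:
    """Generic hinge loss over one positive and several negatives.

    Each negative contributes max(0, margin + d(anchor, positive)
    - d(anchor, negative)), so a negative that is already farther
    from the anchor than the positive by at least `margin` adds zero.
    """
    d_pos = squared_distance(anchor, positive)
    return sum(
        max(0.0, margin + d_pos - squared_distance(anchor, negative))
        for negative in negatives
    )


# Hypothetical 2-D embeddings: one far negative and one near negative.
loss = multi_triplet_loss(
    anchor=[0.0, 0.0],
    positive=[0.0, 1.0],
    negatives=[[3.0, 0.0], [1.0, 0.0]],
)
print(loss)
```

In a pose-based setting, one plausible choice would be to take the positive from the same class at a nearby pose and the negatives from other classes or distant poses, but that selection rule is likewise an assumption here.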
