

Multi-View 3D Object Retrieval With Deep Embedding Network.

Publication Information

IEEE Trans Image Process. 2016 Dec;25(12):5526-5537. doi: 10.1109/TIP.2016.2609814. Epub 2016 Sep 15.

Abstract

In multi-view 3D object retrieval, each object is characterized by a group of 2D images captured from different views. Rather than using hand-crafted features, in this paper we take advantage of the strong discriminative power of convolutional neural networks to learn an effective 3D object representation tailored for this retrieval task. Specifically, we propose a deep embedding network jointly supervised by a classification loss and a triplet loss to map the high-dimensional image space into a low-dimensional feature space, where the Euclidean distance between features directly corresponds to the semantic similarity of images. By effectively reducing the intra-class variations of the input images while increasing the inter-class ones, the network guarantees that similar images lie closer than dissimilar ones in the learned feature space. In addition, we extensively investigate the effectiveness of deep features extracted from different layers of the embedding network and find that an efficient 3D object representation should strike a trade-off between global semantic information and discriminative local characteristics. Then, with the set of deep features extracted from different views, we generate a comprehensive description for each 3D object and formulate multi-view 3D object retrieval as a set-to-set matching problem. Extensive experiments on the SHREC'15 dataset demonstrate the superiority of the proposed method over previous state-of-the-art approaches, with over 12% performance improvement.
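
The joint supervision (classification loss plus triplet loss) and the set-to-set matching described in the abstract can be illustrated with a short sketch. The following is a minimal PyTorch sketch, not the authors' implementation: the linear stand-in for the CNN backbone, the embedding size, the margin, the classification weight `lambda_cls`, and the min-distance matching rule are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Maps a view image (here, a pre-extracted backbone feature) to a
    low-dimensional embedding plus class logits."""
    def __init__(self, backbone_dim=4096, embed_dim=128, num_classes=50):
        super().__init__()
        # Stand-in for the CNN backbone output; the paper's actual
        # architecture and layer choices are not reproduced here.
        self.embed = nn.Linear(backbone_dim, embed_dim)    # embedding head
        self.classify = nn.Linear(embed_dim, num_classes)  # classification head

    def forward(self, x):
        z = F.normalize(self.embed(x), dim=1)  # L2-normalised embedding
        return z, self.classify(z)

def joint_loss(net, anchor, positive, negative, labels, margin=0.2, lambda_cls=1.0):
    """Triplet loss on the embeddings plus cross-entropy on the anchor's logits."""
    za, logits_a = net(anchor)
    zp, _ = net(positive)
    zn, _ = net(negative)
    trip = F.triplet_margin_loss(za, zp, zn, margin=margin)  # pull positives, push negatives
    cls = F.cross_entropy(logits_a, labels)                  # class-level supervision
    return trip + lambda_cls * cls

def set_to_set_distance(query_views, target_views):
    """Distance between two 3D objects, each represented by a set of view
    embeddings: mean over query views of the closest target view (an assumed
    matching rule, not necessarily the paper's)."""
    d = torch.cdist(query_views, target_views)  # pairwise Euclidean distances
    return d.min(dim=1).values.mean()
```

At retrieval time, each gallery object would be ranked by `set_to_set_distance` between its view embeddings and those of the query object, so that objects whose views lie closest in the learned feature space are returned first.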

