Spatial Pyramid-Enhanced NetVLAD With Weighted Triplet Loss for Place Recognition.

Authors

Yu Jun, Zhu Chaoyang, Zhang Jian, Huang Qingming, Tao Dacheng

Publication information

IEEE Trans Neural Netw Learn Syst. 2020 Feb;31(2):661-674. doi: 10.1109/TNNLS.2019.2908982. Epub 2019 Apr 26.

DOI: 10.1109/TNNLS.2019.2908982
PMID: 31034423

Abstract

We propose an end-to-end place recognition model based on a novel deep neural network. First, we propose to exploit the spatial pyramid structure of the images to enhance the vector of locally aggregated descriptors (VLAD) such that the enhanced VLAD features can reflect the structural information of the images. To encode this feature extraction into the deep learning method, we build a spatial pyramid-enhanced VLAD (SPE-VLAD) layer. Next, we impose weight constraints on the terms of the traditional triplet loss (T-loss) function such that the weighted T-loss (WT-loss) function avoids the suboptimal convergence of the learning process. The loss function can work well under weakly supervised scenarios in that it determines the semantically positive and negative samples of each query through not only the GPS tags but also the Euclidean distance between the image representations. The SPE-VLAD layer and the WT-loss layer are integrated with the VGG-16 network or ResNet-18 network to form a novel end-to-end deep neural network that can be easily trained via the standard backpropagation method. We conduct experiments on three benchmark data sets, and the results demonstrate that the proposed model defeats the state-of-the-art deep learning approaches applied to place recognition.
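
The idea of enhancing VLAD with a spatial pyramid can be illustrated with a small sketch. The code below is not the SPE-VLAD layer from the paper: it assumes classic hard-assignment VLAD with a fixed codebook (NetVLAD-style layers use soft assignment with learnable centers), and the pyramid levels, the per-cell L2 normalization, and the names `vlad`, `l2_normalize`, and `spatial_pyramid_vlad` are all hypothetical choices made for illustration. It only shows how per-cell aggregation lets the final descriptor keep coarse structural information about the image.

```python
import math


def vlad(descriptors, centers):
    """Hard-assignment VLAD: sum the residuals of each descriptor
    to its nearest center, then flatten to a K*D vector."""
    k, d = len(centers), len(centers[0])
    agg = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Index of the nearest center by squared Euclidean distance.
        j = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centers[c])),
        )
        for t in range(d):
            agg[j][t] += x[t] - centers[j][t]
    return [v for row in agg for v in row]


def l2_normalize(vec):
    """Scale a vector to unit L2 norm; an all-zero vector is returned as-is."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def spatial_pyramid_vlad(feature_map, centers, levels=(1, 2)):
    """feature_map is an H x W grid of D-dimensional local descriptors.

    For each pyramid level n (an assumed setting), the grid is split into
    n x n cells, VLAD is computed per cell, and all cell vectors are
    concatenated into one descriptor.
    """
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                rows = range(i * h // n, (i + 1) * h // n)
                cols = range(j * w // n, (j + 1) * w // n)
                cell = [feature_map[r][c] for r in rows for c in cols]
                out.extend(l2_normalize(vlad(cell, centers)))
    return l2_normalize(out)


# Toy 4 x 4 feature map with 2-D descriptors and a 2-word codebook.
feature_map = [[[float(r), float(c)] for c in range(4)] for r in range(4)]
centers = [[0.0, 0.0], [3.0, 3.0]]
desc = spatial_pyramid_vlad(feature_map, centers)
print(len(desc))  # (1 + 4) cells * 2 centers * 2 dims = 20
```

With levels `(1, 2)` the descriptor is five times longer than plain VLAD, which is the usual cost of a spatial pyramid: the level-1 cell matches ordinary VLAD over the whole image, while the four level-2 cells record which quadrant each residual came from.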
