In defense of local descriptor-based few-shot object detection.

Authors

Zhou Shichao, Li Haoyan, Wang Zhuowei, Zhang Zekai

Affiliation

Key Laboratory of Information and Communication Systems, Ministry of Information Industry, Beijing Information Science and Technology University, Beijing, China.

Publication

Front Neurosci. 2024 Feb 12;18:1349204. doi: 10.3389/fnins.2024.1349204. eCollection 2024.

DOI:10.3389/fnins.2024.1349204
PMID:38410158
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10894920/
Abstract

State-of-the-art object detection models require an intensive parameter fine-tuning stage (using deep convolutional networks, etc.) with tens or hundreds of training examples. In contrast, human intelligence can robustly learn a new concept from just a few instances (i.e., few-shot detection). The distinct perception mechanisms of these two families of systems prompt us to revisit classical handcrafted local descriptors (e.g., SIFT, HOG) as well as non-parametric visual models, which inherently require no learning/training phase. Herein, we claim that the inferior performance of these local descriptors mainly results from a lack of global structure sense. To address this issue, we refine local descriptors with spatial contextual attention over neighbor affinities and then embed them into a discriminative subspace guided by a Kernel-InfoNCE loss. Unlike conventional quantization of local descriptors in a high-dimensional feature space or isometric dimension reduction, we seek a brain-inspired few-shot feature representation for the object manifold, which combines data-independent primitive representation with semantic context learning and thus helps with generalization. The resulting embeddings, used as pattern vectors/tensors, permit an accelerated yet non-parametric visual similarity computation as the decision rule for final detection. Our approach to few-shot object detection is nearly learning-free, and experiments on remote sensing imagery (approximately a 2-D affine space) confirm the efficacy of our model.
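
The three steps named in the abstract (neighbor-aware refinement, a contrastive embedding objective, and a similarity-based decision rule) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian spatial affinity, the plain (non-kernel) InfoNCE loss, the cosine threshold, and all function and parameter names below are assumptions chosen only to make the ideas concrete.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def refine_with_neighbors(descriptors, positions, sigma=1.0):
    """Replace each descriptor by an affinity-weighted mean of all descriptors.

    Assumption: the affinity is a Gaussian of the spatial distance between
    descriptor locations; the paper's attention may be defined differently.
    """
    dim = len(descriptors[0])
    refined = []
    for p in positions:
        weights = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / (2 * sigma ** 2))
            for q in positions
        ]
        total = sum(weights)  # at least 1.0, since each point is its own neighbor
        refined.append(
            [sum(w * d[k] for w, d in zip(weights, descriptors)) / total
             for k in range(dim)]
        )
    return refined


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Standard InfoNCE loss for one anchor (not the paper's kernel variant)."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


def detect(query, windows, threshold=0.9):
    """Non-parametric decision rule: keep windows similar enough to the query."""
    return [i for i, w in enumerate(windows) if cosine(query, w) >= threshold]


if __name__ == "__main__":
    descs = refine_with_neighbors([[1.0, 0.0], [0.0, 1.0]], [(0, 0), (0, 0)])
    print(descs)
    print(info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]]))
    print(detect([1.0, 0.0], [[2.0, 0.0], [0.0, 3.0], [1.0, 0.1]]))
```

In this sketch the loss is small when the anchor is closer to its positive than to the negatives, and detection involves no trained classifier, only a similarity threshold, which is the sense in which such a pipeline can be called nearly learning-free.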
