Compositional Model Based Fisher Vector Coding for Image Classification.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2017 Dec;39(12):2335-2348. doi: 10.1109/TPAMI.2017.2651061. Epub 2017 Jan 10.

DOI:10.1109/TPAMI.2017.2651061

Abstract

Derived from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) as the generative model for local features. However, the representative power of a GMM can be limited because it essentially assumes that local features can be characterized by a fixed number of feature prototypes, and the number of prototypes is usually small in FVC. To alleviate this limitation, in this work we break with the convention that a local feature is drawn from one of a few Gaussian distributions. Instead, we adopt a compositional mechanism which assumes that a local feature is drawn from a Gaussian distribution whose mean vector is a linear combination of multiple key components, with the combination weight being a latent random variable. In doing so, we greatly enhance the representative power of the generative model underlying FVC. To implement our idea, we design two particular generative models following this compositional approach. In our first model, the mean vector is sampled from the subspace spanned by a set of bases and the combination weight is drawn from a Laplace distribution. In our second model, we further assume that a local feature is composed of a discriminative part and a residual part. As a result, a local feature is generated by the linear combination of discriminative part bases and residual part bases. The decomposition into discriminative and residual parts is achieved via the guidance of a pre-trained supervised coding method. By calculating the gradient vector of the proposed models, we derive two new Fisher vector coding strategies. The first is termed Sparse Coding-based Fisher Vector Coding (SCFVC) and can be used as a substitute for traditional GMM-based FVC. The second is termed Hybrid Sparse Coding-based Fisher Vector Coding (HSCFVC) since it combines the merits of both pre-trained supervised coding methods and FVC. Using pre-trained Convolutional Neural Network (CNN) activations as local features, we experimentally demonstrate that the proposed methods are superior to traditional GMM-based FVC and achieve state-of-the-art performance in various image classification tasks.
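
The first model in the abstract can be illustrated with a small sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: a feature x is taken to follow a Gaussian with mean B u, the Laplace prior on u is handled by a MAP estimate (an L1-regularized least-squares problem, solved here with plain ISTA), and the gradient of the approximate log-likelihood with respect to the bases B is taken to be proportional to (x - B u) u^T, summed over all local features. The function names, the value of `lam`, and the sum pooling are hypothetical choices, not the authors' implementation.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code_ista(x, B, lam=0.1, n_iter=200):
    """Approximate argmin_u 0.5*||x - B u||^2 + lam*||u||_1 with ISTA.

    This stands in for the MAP estimate of the latent combination
    weight under a Laplace prior (an assumption of this sketch).
    """
    L = np.linalg.norm(B, 2) ** 2  # Lipschitz constant of the smooth term
    u = np.zeros(B.shape[1])
    for _ in range(n_iter):
        grad = B.T @ (B @ u - x)
        u = soft_threshold(u - grad / L, lam / L)
    return u

def scfvc_like_encoding(X, B, lam=0.1):
    """Sum the per-feature gradient (x - B u) u^T over all local features.

    X: (n_features, d) local features; B: (d, m) bases.
    Returns a flattened vector of length d * m.
    """
    d, m = B.shape
    G = np.zeros((d, m))
    for x in X:
        u = sparse_code_ista(x, B, lam)
        G += np.outer(x - B @ u, u)  # residual times sparse code
    return G.ravel()

# Toy usage with random data (hypothetical sizes).
rng = np.random.default_rng(0)
B = rng.standard_normal((8, 16))
B /= np.linalg.norm(B, axis=0)      # unit-norm bases
X = rng.standard_normal((5, 8))     # five local features of dimension 8
fv = scfvc_like_encoding(X, B)
print(fv.shape)  # (128,)
```

Because most entries of u are zero under the sparsity-inducing prior, only the columns of the gradient matching active bases are non-zero for a given feature, which is one way to read the abstract's claim that a feature is composed from a few key components rather than assigned to one of a few prototypes.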

Similar articles

Compositional Model Based Fisher Vector Coding for Image Classification.

IEEE Trans Pattern Anal Mach Intell. 2017 Dec;39(12):2335-2348. doi: 10.1109/TPAMI.2017.2651061. Epub 2017 Jan 10.

Deep FisherNet for Image Classification.

IEEE Trans Neural Netw Learn Syst. 2019 Jul;30(7):2244-2250. doi: 10.1109/TNNLS.2018.2874657. Epub 2018 Nov 5.

IEEE Trans Image Process. 2017 Jul;26(7):3221-3234. doi: 10.1109/TIP.2017.2694320. Epub 2017 Apr 13.

Generalized Pooling for Robust Object Tracking.

IEEE Trans Image Process. 2016 Sep;25(9):4199-4208. doi: 10.1109/TIP.2016.2588329. Epub 2016 Jul 7.

Fisher Discrimination Regularized Robust Coding Based on a Local Center for Tumor Classification.

Sci Rep. 2018 Jun 14;8(1):9152. doi: 10.1038/s41598-018-27364-7.

Discriminative Learning for Automatic Staging of Placental Maturity via Multi-layer Fisher Vector.

Sci Rep. 2015 Jul 31;5:12818. doi: 10.1038/srep12818.

Training Faster by Separating Modes of Variation in Batch-Normalized Models.

IEEE Trans Pattern Anal Mach Intell. 2020 Jun;42(6):1483-1500. doi: 10.1109/TPAMI.2019.2895781. Epub 2019 Jan 28.

Weakly Supervised PatchNets: Describing and Aggregating Local Patches for Scene Recognition.

IEEE Trans Image Process. 2017 Apr;26(4):2028-2041. doi: 10.1109/TIP.2017.2666739. Epub 2017 Feb 9.

Cross-Convolutional-Layer Pooling for Image Recognition.

IEEE Trans Pattern Anal Mach Intell. 2017 Nov;39(11):2305-2313. doi: 10.1109/TPAMI.2016.2637921. Epub 2016 Dec 9.

Locally Supervised Deep Hybrid Model for Scene Recognition.

IEEE Trans Image Process. 2017 Feb;26(2):808-820. doi: 10.1109/TIP.2016.2629443. Epub 2016 Nov 16.

Cited by

Tackling over-smoothing in multi-label image classification using graphical convolution neural network.

Evol Syst (Berl). 2022 Sep 7:1-11. doi: 10.1007/s12530-022-09463-z.

CorLabelNet: a comprehensive framework for multi-label chest X-ray image classification with correlation guided discriminant feature learning and oversampling.

Med Biol Eng Comput. 2025 Apr;63(4):1045-1058. doi: 10.1007/s11517-024-03247-0. Epub 2024 Nov 29.

Research on Chest Disease Recognition Based on Deep Hierarchical Learning Algorithm.

J Healthc Eng. 2022 Jan 7;2022:6996444. doi: 10.1155/2022/6996444. eCollection 2022.

Introspective analysis of convolutional neural networks for improving discrimination performance and feature visualisation.

PeerJ Comput Sci. 2021 May 4;7:e497. doi: 10.7717/peerj-cs.497. eCollection 2021.

Spatially-Constrained Fisher Representation for Brain Disease Identification With Incomplete Multi-Modal Neuroimages.

IEEE Trans Med Imaging. 2020 Sep;39(9):2965-2975. doi: 10.1109/TMI.2020.2983085. Epub 2020 Mar 24.
