Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2021 Aug;43(8):2582-2597. doi: 10.1109/TPAMI.2020.2974833. Epub 2021 Jul 1.

DOI:10.1109/TPAMI.2020.2974833

PMID:32086198

Abstract

Compared with global average pooling in existing deep convolutional neural networks (CNNs), global covariance pooling can capture richer statistics of deep features, having potential for improving representation and generalization abilities of deep CNNs. However, integration of global covariance pooling into deep CNNs brings two challenges: (1) robust covariance estimation given deep features of high dimension and small sample size; (2) appropriate usage of geometry of covariances. To address these challenges, we propose a global Matrix Power Normalized COVariance (MPN-COV) Pooling. Our MPN-COV conforms to a robust covariance estimator, very suitable for scenario of high dimension and small sample size. It can also be regarded as Power-Euclidean metric between covariances, effectively exploiting their geometry. Furthermore, a global Gaussian embedding network is proposed to incorporate first-order statistics into MPN-COV. For fast training of MPN-COV networks, we implement an iterative matrix square root normalization, avoiding GPU unfriendly eigen-decomposition inherent in MPN-COV. Additionally, progressive 1×1 convolutions and group convolution are introduced to compress covariance representations. The proposed methods are highly modular, readily plugged into existing deep CNNs. Extensive experiments are conducted on large-scale object classification, scene categorization, fine-grained visual recognition and texture classification, showing our methods outperform the counterparts and obtain state-of-the-art performance.
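The pooling step described in the abstract can be illustrated with a small NumPy sketch. It computes the sample covariance of a set of deep features and then approximates its matrix square root with a coupled Newton-Schulz iteration, a standard eigendecomposition-free method. The abstract says only that an iterative matrix square root normalization is used, so the specific iteration, the trace pre-normalization, the 1/n covariance scaling, the iteration count, and all function names here are assumptions of this sketch rather than the authors' implementation.

```python
import numpy as np


def covariance_pool(features):
    """Sample covariance of features with shape (n, d): n positions, d channels.

    Assumption: the covariance is scaled by 1/n.
    """
    n = features.shape[0]
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / n


def newton_schulz_sqrt(cov, num_iters=5, eps=1e-12):
    """Approximate the square root of a symmetric PSD matrix by iteration.

    Coupled Newton-Schulz iteration (assumed here, not taken from the abstract):
        Y <- 0.5 * Y (3I - Z Y),   Z <- 0.5 * (3I - Z Y) Z
    with Y0 = A and Z0 = I, where A = cov / trace(cov).
    """
    d = cov.shape[0]
    identity = np.eye(d)
    trace = np.trace(cov) + eps
    y = cov / trace  # pre-normalize so the eigenvalues lie in (0, 1]
    z = identity.copy()
    for _ in range(num_iters):
        t = 0.5 * (3.0 * identity - z @ y)
        y = y @ t
        z = t @ z
    return y * np.sqrt(trace)  # undo the trace pre-normalization


def mpn_cov_pool(features, num_iters=5):
    """Covariance pooling followed by square-root (power 1/2) normalization."""
    return newton_schulz_sqrt(covariance_pool(features), num_iters=num_iters)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(196, 8))  # e.g. 14x14 positions, 8 channels
    pooled = mpn_cov_pool(feats, num_iters=10)
    print(pooled.shape)
```

The loop uses only matrix multiplications, which is why this kind of iteration suits GPUs better than eigendecomposition. A handful of iterations gives an approximation; more iterations tighten it when the covariance is well conditioned.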


Similar articles

Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization.

IEEE Trans Pattern Anal Mach Intell. 2021 Aug;43(8):2582-2597. doi: 10.1109/TPAMI.2020.2974833. Epub 2021 Jul 1.

Second-order asymmetric convolution network for breast cancer histopathology image classification.

J Biophotonics. 2022 May;15(5):e202100370. doi: 10.1002/jbio.202100370. Epub 2022 Feb 9.

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.

IEEE Trans Pattern Anal Mach Intell. 2015 Sep;37(9):1904-16. doi: 10.1109/TPAMI.2015.2389824.

Fine-Tuning CNN Image Retrieval with No Human Annotation.

IEEE Trans Pattern Anal Mach Intell. 2019 Jul;41(7):1655-1668. doi: 10.1109/TPAMI.2018.2846566. Epub 2018 Jun 12.

Deep Attention-Based Spatially Recursive Networks for Fine-Grained Visual Recognition.

IEEE Trans Cybern. 2019 May;49(5):1791-1802. doi: 10.1109/TCYB.2018.2813971. Epub 2018 Mar 22.

A failure to learn object shape geometry: Implications for convolutional neural networks as plausible models of biological vision.

Vision Res. 2021 Dec;189:81-92. doi: 10.1016/j.visres.2021.09.004. Epub 2021 Oct 8.

The Whole Is More Than Its Parts? From Explicit to Implicit Pose Normalization.

IEEE Trans Pattern Anal Mach Intell. 2020 Mar;42(3):749-763. doi: 10.1109/TPAMI.2018.2885764. Epub 2018 Dec 18.

Automatically Designing CNN Architectures Using the Genetic Algorithm for Image Classification.

IEEE Trans Cybern. 2020 Sep;50(9):3840-3854. doi: 10.1109/TCYB.2020.2983860. Epub 2020 Apr 21.

Multi-Scale Feature Fusion of Covariance Pooling Networks for Fine-Grained Visual Recognition.

Sensors (Basel). 2023 Apr 13;23(8):3970. doi: 10.3390/s23083970.

A novel feature representation: Aggregating convolution kernels for image retrieval.

Neural Netw. 2020 Oct;130:1-10. doi: 10.1016/j.neunet.2020.06.010. Epub 2020 Jun 24.

Cited by

DBRSNet: a dual-branch remote sensing image segmentation model based on feature interaction and multi-scale feature fusion.

Sci Rep. 2025 Jul 30;15(1):27786. doi: 10.1038/s41598-025-13236-4.

Interweaving Insights: High-Order Feature Interaction for Fine-Grained Visual Recognition.

Int J Comput Vis. 2025;133(4):1755-1779. doi: 10.1007/s11263-024-02260-y. Epub 2024 Oct 20.

A bimodal deep learning network based on CNN for fine motor imagery.

Cogn Neurodyn. 2024 Dec;18(6):3791-3804. doi: 10.1007/s11571-024-10159-0. Epub 2024 Aug 19.

Learning to integrate parts for whole through correlated neural variability.

PLoS Comput Biol. 2024 Sep 3;20(9):e1012401. doi: 10.1371/journal.pcbi.1012401. eCollection 2024 Sep.

Linear optimal transport subspaces for point set classification.

Res Sq. 2024 Mar 22:rs.3.rs-4106387. doi: 10.21203/rs.3.rs-4106387/v1.

Diagnosis of schizophrenia with functional connectome data: a graph-based convolutional neural network approach.

BMC Neurosci. 2022 Jan 17;23(1):5. doi: 10.1186/s12868-021-00682-9.

Efficacy of liver cancer microwave ablation through ultrasonic image guidance under deep migration feature algorithm.

Pak J Med Sci. 2021;37(6):1693-1698. doi: 10.12669/pjms.37.6-WIT.4885.

Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval.

Entropy (Basel). 2020 Nov 7;22(11):1266. doi: 10.3390/e22111266.
