

Bridging the Gap Between Few-Shot and Many-Shot Learning via Distribution Calibration.

Authors

Yang Shuo, Wu Songhua, Liu Tongliang, Xu Min

Publication

IEEE Trans Pattern Anal Mach Intell. 2022 Dec;44(12):9830-9843. doi: 10.1109/TPAMI.2021.3132021. Epub 2022 Nov 7.

Abstract

A major gap between few-shot and many-shot learning is the data distribution empirically observed by the model during training. In few-shot learning, the learned model can easily become over-fitted to the biased distribution formed by only a few training examples, whereas in many-shot learning the ground-truth data distribution is uncovered more accurately, allowing a well-generalized model to be learned. In this paper, we propose to calibrate the distribution of these few-sample classes to be less biased, thereby alleviating the over-fitting problem. The distribution calibration is achieved by transferring statistics from classes with sufficient examples to the few-sample classes. After calibration, an adequate number of examples can be sampled from the calibrated distribution to expand the inputs to the classifier. Specifically, we assume that every dimension of the feature representation within the same class follows a Gaussian distribution, so that the mean and variance of the distribution can be borrowed from those of similar classes, whose statistics are better estimated thanks to an adequate number of samples. Extensive experiments on three datasets, miniImageNet, tieredImageNet, and CUB, show that a simple linear classifier trained on features sampled from our calibrated distribution outperforms the state-of-the-art accuracy by a large margin. Beyond its favorable performance, the proposed method is also highly flexible: it yields consistent accuracy improvements when built on top of any off-the-shelf pretrained feature extractor and classification model, without extra learnable parameters. Visualization of the generated features demonstrates that our calibrated distribution is an accurate estimation, and thus the gain in generalization ability is convincing. We also establish a generalization error bound for the proposed distribution-calibration-based few-shot learning, which consists of the distribution assumption error, the distribution approximation error, and the estimation error. This bound theoretically justifies the effectiveness of the proposed method.
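To make the calibration step concrete, below is a minimal Python sketch of the idea under the abstract's Gaussian assumption: borrow mean and covariance statistics from the nearest well-sampled base classes, draw synthetic features from the calibrated distribution, and train a plain linear classifier on the expanded set. The function name, the neighbor count k, the dispersion constant alpha, the sample count, and the random toy features are all illustrative assumptions, not the paper's exact procedure or hyperparameters.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def calibrate_and_sample(support_feats, base_means, base_covs,
                         k=2, alpha=0.2, n_sampled=500):
    """Calibrate one novel class's feature distribution using base-class
    statistics, then draw synthetic features from the calibrated Gaussian.

    support_feats: (n_shot, d) features of the novel class
    base_means:    (n_base, d) per-class feature means of base classes
    base_covs:     (n_base, d, d) per-class feature covariances
    """
    mu_s = support_feats.mean(axis=0)
    # Find the k base classes whose means lie closest to the support mean;
    # their statistics are well estimated from many samples.
    nearest = np.argsort(np.linalg.norm(base_means - mu_s, axis=1))[:k]
    # Calibrated mean: blend the support mean with the nearest base means.
    mu_c = (base_means[nearest].sum(axis=0) + mu_s) / (k + 1)
    # Calibrated covariance: average the nearest base covariances and add
    # a small diagonal term (alpha) for extra dispersion.
    cov_c = base_covs[nearest].mean(axis=0) + alpha * np.eye(len(mu_s))
    return rng.multivariate_normal(mu_c, cov_c, size=n_sampled)

# Toy demo: random vectors stand in for features from a pretrained extractor.
d, n_base = 16, 20
base_means = rng.normal(size=(n_base, d))
base_covs = np.stack([0.5 * np.eye(d)] * n_base)

# A 2-way 1-shot task: one support feature per novel class.
support = {c: rng.normal(loc=base_means[c], scale=0.7, size=(1, d))
           for c in (0, 1)}

X, y = [], []
for c, feats in support.items():
    sampled = calibrate_and_sample(feats, base_means, base_covs)
    X.append(np.vstack([feats, sampled]))      # real + sampled features
    y.extend([c] * (len(feats) + len(sampled)))

# A simple linear classifier trained on the expanded feature set.
clf = LogisticRegression(max_iter=1000).fit(np.vstack(X), y)

Note that calibration only manipulates class statistics and sampling, so it adds no learnable parameters; this is what lets the method sit on top of any off-the-shelf pretrained feature extractor, as the abstract claims.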

