Yang Qinghua, Liu Bin, Tian Yan, Shi Yangming, Du Xinxin, He Fangyuan, Guo Jikun
School of Artificial Intelligence, China University of Mining and Technology (Beijing), Beijing, China.
Energy Ningxia Coal Industry Co., Ltd., YangChangWan Coal Mine, YingChuan, NingXia, China.
PLoS One. 2025 Jun 6;20(6):e0322360. doi: 10.1371/journal.pone.0322360. eCollection 2025.
Few-shot learning techniques have enabled the rapid adaptation of a general AI model to various tasks using limited data. In this study, we focus on class-agnostic low-shot object counting, a challenging problem that aims to achieve accurate object counting with only a few annotated samples (few-shot) or even in the absence of any annotated data (zero-shot). Existing methods focus primarily on enhancing performance and give relatively little attention to inference time, an equally critical factor in many practical applications. We propose a model that achieves real-time inference without compromising performance. Specifically, we design a multi-scale hybrid encoder to enhance feature representation and optimize computational efficiency. This encoder applies self-attention exclusively to high-level features and uses cross-scale fusion modules to integrate adjacent features, reducing training costs. Additionally, we introduce a learnable shape embedding and an iterative exemplar feature learning module that progressively enriches exemplar features with class-level characteristics by learning from similar objects within the image; these enriched features are essential for improving subsequent matching performance. Extensive experiments on the FSC147, Val-COCO, Test-COCO, CARPK, and ShanghaiTech datasets demonstrate our model's effectiveness and generalizability compared to state-of-the-art methods.
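As a rough illustration of why restricting self-attention to high-level features can save computation, the sketch below counts query-key pairs for a three-level feature pyramid. It is not the authors' implementation: the 512x512 input size, the strides (8, 16, 32), and the function names are all assumptions chosen for illustration, and attention is assumed to run independently within each scale.

```python
def attention_pairs(height, width, stride):
    """Query-key pairs for full self-attention over one feature map.

    A feature map at the given stride has (height // stride) * (width // stride)
    tokens, and full self-attention compares every token with every token.
    """
    tokens = (height // stride) * (width // stride)
    return tokens * tokens


def compare_costs(height=512, width=512, strides=(8, 16, 32)):
    """Compare attention at every scale with attention at the coarsest scale only.

    The input size and strides are hypothetical values, not taken from the paper.
    """
    per_scale = {s: attention_pairs(height, width, s) for s in strides}
    all_scales = sum(per_scale.values())
    top_only = per_scale[max(strides)]  # largest stride = highest-level features
    return per_scale, all_scales, top_only


per_scale, all_scales, top_only = compare_costs()
for stride, pairs in per_scale.items():
    print(f"stride {stride}: {pairs} query-key pairs")
print(f"attention at every scale: {all_scales}")
print(f"attention at coarsest scale only: {top_only}")
```

Under these assumed settings, the coarsest map accounts for only a small fraction of the total pairs, because the cost of full self-attention grows quadratically with the number of tokens.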