Khan Muhammad Asif, Menouar Hamid, Hamila Ridha, Abu-Dayya Adnan
Qatar Mobility Innovations Center, Qatar University, Doha, Qatar.
Electrical Engineering, Qatar University, Doha, Qatar.
Sci Rep. 2025 Apr 8;15(1):11932. doi: 10.1038/s41598-025-90750-5.
Visual crowd counting has received considerable attention over the last few years. Consistent contributions to this topic have addressed several inherent challenges, such as scale variation, occlusion, and cross-scene application. However, these works focus on improving accuracy and often ignore model size and computational complexity. Several practical applications run crowd models on resource-limited stand-alone devices, such as drones, and require real-time inference. Although there have been some good efforts to develop lightweight, shallow crowd models that offer fast inference, the literature dedicated to lightweight crowd counting remains limited. One possible reason is that lightweight deep-learning models suffer accuracy degradation in complex scenes because of their limited generalization capability. This paper addresses this problem by applying knowledge distillation to improve the learning capability of lightweight crowd models. Knowledge distillation enables lightweight models to emulate deeper models by transferring, during training, the knowledge learned by the deeper model. The paper presents a detailed experimental analysis of three lightweight crowd models over six benchmark datasets. The results show clear gains from the proposed method, supported by several ablation studies.
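To illustrate the general idea described in the abstract, the following is a minimal sketch of density-map knowledge distillation for crowd counting. The placeholder architectures, the teacher/student naming, the loss weighting factor alpha, and all hyperparameters are assumptions for illustration only; they are not the models or settings used in the paper.

```python
# Illustrative sketch of knowledge distillation for crowd counting.
# The architectures, loss weighting, and hyperparameters are assumed, not from the paper.
import torch
import torch.nn as nn

class TinyCounter(nn.Module):
    """Placeholder fully convolutional counter that outputs a density map."""
    def __init__(self, width: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, 1, 1),
        )

    def forward(self, x):
        return self.net(x)

teacher = TinyCounter(width=64)  # stands in for a deep, pre-trained crowd model
student = TinyCounter(width=8)   # stands in for a lightweight crowd model
teacher.eval()                   # teacher weights stay frozen during distillation

mse = nn.MSELoss()
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
alpha = 0.5  # assumed weight balancing ground-truth and distillation terms

def train_step(images, gt_density):
    """One distillation step: the student matches both the ground-truth
    density map and the frozen teacher's predicted density map."""
    with torch.no_grad():
        teacher_density = teacher(images)
    student_density = student(images)
    loss = mse(student_density, gt_density) + alpha * mse(student_density, teacher_density)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch: 2 RGB images and their ground-truth density maps.
images = torch.randn(2, 3, 128, 128)
gt_density = torch.rand(2, 1, 128, 128)
print(train_step(images, gt_density))
```

In this sketch the distillation signal is a simple MSE between student and teacher density maps; the actual distillation objective used in the paper may differ.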