Yu Jiamao, Hu Hexuan
College of Computer Science and Software Engineering, Hohai University, Nanjing, 211100, China.
Sci Rep. 2025 Jan 22;15(1):2866. doi: 10.1038/s41598-025-86247-w.
Crowd counting aims to estimate the number, density, and distribution of people in an image. Although CNN-based crowd counting methods have proven effective, head-scale variation and complex backgrounds remain two major challenges. We therefore propose a multiscale region calibration network, MRCNet, to address both. For head-scale variation, we design a multiscale aware module that runs multiple dilated-convolution branches in parallel to obtain multiscale receptive fields and cope with drastic changes in head size. For complex backgrounds, we design a regional calibration module that, after the attention map is obtained, calibrates the attention weights of each region. Additionally, we improve the loss function by combining L2 loss with binary cross-entropy loss. Extensive experiments on three mainstream datasets demonstrate the robustness and competitiveness of our approach.
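The combined loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two terms are combined, so this assumes the L2 term is applied to the predicted density map, the binary cross-entropy term is applied to the attention map against a 0/1 foreground mask, and the two are summed with a hypothetical balancing factor `weight`. All function and parameter names are illustrative and are not taken from MRCNet.

```python
import math


def l2_loss(pred, target):
    """Mean squared error between predicted and ground-truth density values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def bce_loss(prob, label, eps=1e-7):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(prob, label):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(prob)


def combined_loss(density_pred, density_gt, attn_prob, attn_mask, weight=1.0):
    """Assumed form: L2 on the density map plus weighted BCE on the attention map.

    `weight` is a hypothetical hyperparameter; the abstract does not give one.
    Inputs are flattened maps (plain lists of floats) to keep the sketch simple.
    """
    return l2_loss(density_pred, density_gt) + weight * bce_loss(attn_prob, attn_mask)
```

In this sketch the L2 term penalizes errors in the estimated density, while the cross-entropy term penalizes attention placed on background pixels, which is one plausible reading of why the two losses would be paired.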