School of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai 201418, China.
School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai 201418, China.
Sensors (Basel). 2021 May 29;21(11):3777. doi: 10.3390/s21113777.
In this paper, we propose a novel network for crowd density estimation in congested scenes, the Adaptive Multi-scale Context Aggregation Network (MSCANet). MSCANet efficiently leverages spatial context information to estimate crowd density in complicated crowd scenes. To this end, we propose a multi-scale context learning block, the Multi-scale Context Aggregation module (MSCA), which first extracts information at different scales and then adaptively aggregates it to capture the full range of crowd scales. By cascading multiple MSCAs, MSCANet exploits the spatial context information in depth and modulates preliminary features into more discriminative and scale-sensitive features, which are finally passed through a 1 × 1 convolution to obtain the crowd density results. Extensive experiments on three challenging crowd counting benchmarks show that our model yields compelling performance compared with other state-of-the-art methods. To further demonstrate the generality of MSCANet, we extend our method to two related tasks: crowd localization and remote sensing object counting. The results of these extension experiments also confirm the effectiveness of MSCANet.
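The pipeline described in the abstract (extract features at several scales, aggregate them adaptively, cascade several such blocks, then apply a 1 × 1 convolution) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how the scales are extracted or how the aggregation weights are computed, so the sketch assumes 3 × 3 dilated convolutions with dilation rates 1, 2 and 3 for the scale branches and a per-pixel softmax gate for the adaptive aggregation. The function names (`dilated_conv3x3`, `msca_block`, `mscanet_sketch`, `init_params`) and the random weights are hypothetical, and random input stands in for the preliminary features.

```python
import numpy as np


def dilated_conv3x3(x, w, dilation):
    """Same-padded 3x3 dilated convolution.

    x: (C_in, H, W) feature map, w: (C_out, C_in, 3, 3) kernel.
    """
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (dilation, dilation), (dilation, dilation)))
    out = np.zeros((w.shape[0], h, wd))
    for i in range(3):
        for j in range(3):
            # Shifted view of the padded input for kernel tap (i, j).
            patch = xp[:, i * dilation:i * dilation + h,
                       j * dilation:j * dilation + wd]
            # Mix input channels into output channels for this tap.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out


def msca_block(x, branch_weights, gate_weights, dilations=(1, 2, 3)):
    """Hypothetical MSCA-style block (assumed design, not from the paper).

    Each branch sees a different receptive field; a softmax gate decides,
    per pixel, how much each scale contributes to the output.
    """
    branches = [np.maximum(dilated_conv3x3(x, w, d), 0.0)  # ReLU
                for w, d in zip(branch_weights, dilations)]
    stacked = np.stack(branches)                        # (S, C, H, W)
    scores = np.einsum("sc,chw->shw", gate_weights, x)  # (S, H, W)
    scores -= scores.max(axis=0, keepdims=True)         # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=0, keepdims=True)             # softmax over scales
    return (attn[:, None] * stacked).sum(axis=0)        # (C, H, W)


def mscanet_sketch(features, params):
    """Cascade the blocks, then map to a density map with a 1x1 convolution."""
    x = features
    for block in params["blocks"]:
        x = msca_block(x, block["branches"], block["gate"])
    # A 1x1 convolution is a per-pixel linear map over channels.
    return np.einsum("oc,chw->ohw", params["head"], x)[0]


def init_params(rng, channels, num_blocks, num_scales=3):
    """Random weights for illustration only; a real model would be trained."""
    blocks = [{
        "branches": [0.1 * rng.standard_normal((channels, channels, 3, 3))
                     for _ in range(num_scales)],
        "gate": 0.1 * rng.standard_normal((num_scales, channels)),
    } for _ in range(num_blocks)]
    return {"blocks": blocks,
            "head": 0.1 * rng.standard_normal((1, channels))}


rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 8))   # stand-in for preliminary features
params = init_params(rng, channels=4, num_blocks=2)
density = mscanet_sketch(features, params)  # (8, 8) density map
print(density.shape, float(density.sum()))  # the sum is the estimated count
```

In density-based crowd counting, the estimated count is the sum of the predicted density map. With the untrained random weights used here, the printed value carries no meaning; the example only shows how data flows through the cascade.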