Xie Jinyang, Gu Lingyu, Li Zhonghui, Lyu Lei
School of Information Science and Engineering, Shandong Normal University, 250358 Jinan, China.
Jinan Rail Transit Group Engineering Research Consultation Co., Ltd, 250101 Jinan, China.
Appl Intell (Dordr). 2022;52(11):12191-12205. doi: 10.1007/s10489-021-03030-w. Epub 2022 Feb 2.
To tackle scale variation and complex backgrounds, two of the most intractable problems in crowd counting, we present the Hierarchical Region-Aware Network (HRANet), a framework that focuses on crowd regions in order to predict crowd density more accurately. First, we design a Region-Aware Module (RAM) that captures the internal differences within different regions of the feature map and thereby adaptively extracts contextual features within those regions. Second, we propose a Region Recalibration Module (RRM) that adopts a novel region-aware attention mechanism (RAAM) to further recalibrate the feature weights of different regions. Integrating these two modules effectively suppresses the influence of background regions. In addition, to account for the local correlations within different regions of the crowd density map, we design a Region Awareness Loss (RAL) that reduces false identification while producing a locally consistent density map. Extensive experiments on five challenging datasets demonstrate that the proposed method significantly outperforms existing methods in counting accuracy and in the quality of the generated density map. A further series of experiments on crowd gathering scenes indicates that the method can be applied in practice to crowd localization.
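The abstract does not state the formula for the Region Awareness Loss, so the following is only an illustrative sketch of why a region-level term can penalize errors that a global count misses. It assumes a fixed grid partition of the density map and a mean absolute error over per-region counts; the function names `region_counts` and `region_count_loss` and the `grid` parameter are hypothetical and are not taken from the paper.

```python
def region_counts(density, grid):
    """Sum a 2-D density map (list of rows) over a grid x grid partition."""
    h, w = len(density), len(density[0])
    counts = [[0.0] * grid for _ in range(grid)]
    for y in range(h):
        for x in range(w):
            # Integer arithmetic maps each pixel to the region containing it.
            counts[y * grid // h][x * grid // w] += density[y][x]
    return counts


def region_count_loss(pred, target, grid=2):
    """Mean absolute difference of per-region counts.

    Assumption for illustration only: this is NOT the paper's RAL.
    """
    p = region_counts(pred, grid)
    t = region_counts(target, grid)
    total = sum(abs(p[i][j] - t[i][j]) for i in range(grid) for j in range(grid))
    return total / (grid * grid)


# Toy 4x4 maps: one person in the top-left corner of the ground truth,
# while the prediction places the same mass in the bottom-right corner.
target = [[1.0, 0.0, 0.0, 0.0]] + [[0.0] * 4 for _ in range(3)]
pred = [[0.0] * 4 for _ in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

global_error = abs(sum(map(sum, pred)) - sum(map(sum, target)))
regional_error = region_count_loss(pred, target, grid=2)
```

In this toy case the global count error is zero even though the density is placed in the wrong region, whereas the region-level term is nonzero. That is the kind of false identification a locally aware loss is meant to discourage.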