An Adaptive Multi-Scale Network Based on Depth Information for Crowd Counting

Zhang Peng, Lei Weimin, Zhao Xinlei, Dong Lijia, Lin Zhaonan

School of Computer Science and Engineering, Northeastern University, Shenyang 110167, China.

Artificial Intelligence Research Institute Shenyang, 213 Electronic Technology Co., Ltd., Shenyang 110023, China.

Sensors (Basel). 2023 Sep 11;23(18):7805. doi: 10.3390/s23187805.

Crowd counting, a fundamental computer vision task, plays an important role in many fields such as video surveillance, accident prediction, public security, and intelligent transportation. The task currently faces several challenges. First, because crowd distributions are diverse and population density is increasing, large crowds gather in public places, sports stadiums, and stations, resulting in severe occlusion. Second, when large-scale datasets are annotated, positioning errors can easily affect training results. In addition, the size of human heads in dense images is inconsistent, making it difficult for a single network to identify both near and far targets. Existing crowd counting methods mainly rely on density map regression. However, this framework does not distinguish between the features of distant and near targets and cannot adapt to scale changes. As a result, detection performance in sparsely populated areas is poor. To solve these problems, we propose an adaptive multi-scale far and near distance network based on the convolutional neural network (CNN) framework for counting dense crowds while achieving a good balance between accuracy, inference speed, and performance. First, at the feature level, to enable the model to distinguish near features from far ones, we deepen the network with stacked convolution layers, allocate different receptive fields according to the distance between the target and the camera, and fuse the features of nearby targets to strengthen feature extraction for pedestrians close to the camera. Second, depth information is used to distinguish distant and near targets of different scales, and the original image is cut into four patches to perform pixel-level adaptive modeling of the crowd. In addition, we add the density-normalized average precision (nAP) metric to analyze the spatial localization accuracy of our method. This paper validates the effectiveness of the proposed NF-Net on three challenging benchmarks: Shanghai Tech (Part A and Part B), UCF_CC_50, and UCF-QNRF. Compared with state-of-the-art (SOTA) methods, it performs better across a variety of scenarios. On the UCF-QNRF dataset, it is further validated that our method effectively handles interference from complex backgrounds.
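The density map regression idea mentioned in the abstract can be illustrated with a minimal sketch. This is not NF-Net and not the authors' code: it only shows the generic convention that each annotated head contributes unit mass to a density map, so that summing the map yields the count. The function names, the fixed Gaussian width `sigma`, and the equal 2x2 quadrant split are all assumptions made for illustration; the abstract says the image is cut into four patches with the help of depth information but does not specify how.

```python
import math

def gaussian_density_map(points, height, width, sigma=1.5):
    """Build a density map in which every head annotation sums to 1.

    `points` holds hypothetical (x, y) head positions in pixel coordinates.
    """
    density = [[0.0] * width for _ in range(height)]
    for px, py in points:
        # Unnormalized Gaussian centred on the annotated head position.
        kernel = [
            [math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        mass = sum(map(sum, kernel))
        # Normalize so this head contributes exactly one unit of mass,
        # even if part of the Gaussian falls outside the image.
        for y in range(height):
            for x in range(width):
                density[y][x] += kernel[y][x] / mass
    return density

def split_into_four(grid):
    """Cut a 2D grid into four quadrants (assumed equal-size split)."""
    mid_h, mid_w = len(grid) // 2, len(grid[0]) // 2
    return [
        [row[:mid_w] for row in grid[:mid_h]],   # top-left
        [row[mid_w:] for row in grid[:mid_h]],   # top-right
        [row[:mid_w] for row in grid[mid_h:]],   # bottom-left
        [row[mid_w:] for row in grid[mid_h:]],   # bottom-right
    ]

def count(grid):
    """The estimated count is the sum of all density values."""
    return sum(map(sum, grid))

# Three made-up head annotations on a 16x16 image.
heads = [(3, 2), (10, 4), (12, 13)]
density = gaussian_density_map(heads, height=16, width=16)
patches = split_into_four(density)
patch_counts = [count(p) for p in patches]
total = count(density)
```

Because the patches partition the image, the per-patch counts add up to the whole-image count, which is why a network can estimate each patch separately and sum the results.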
