Fu Qiyan, Min Weidong, Sheng Weixiang, Peng Chunjiang
School of Mathematics and Computer Science, Nanchang University, Nanchang, China.
Institute of Metaverse, Nanchang University, Nanchang, China.
Front Neurorobot. 2024 May 16;18:1383943. doi: 10.3389/fnbot.2024.1383943. eCollection 2024.
Accurately counting dense objects in an image, such as pedestrians or vehicles, is a challenging and practical task. Existing CNN-based density map regression methods mainly count a single class of dense objects in a single scene. However, in complex traffic scenes, vehicles and pedestrians usually appear together, so multiple classes of dense objects need to be counted simultaneously.
To address this problem, we propose a new multi-class dense object counting method based on feature enhancement. It enhances the features of dense objects in complex traffic scenes so that dense vehicles and people can be both classified and counted by regression. The counting model consists of a regression subnet and a classification subnet. The regression subnet generates two-channel predicted density maps and mainly comprises an initial feature layer and a feature enhancement layer; the feature enhancement layer strengthens both the classification features and the regression counting features of dense objects in complex traffic scenes. The classification subnet supervises the separation of dense vehicles and people into two feature channels, which assists the regression counting task of the regression subnet.
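To make the idea of a two-channel density map concrete, the following is a minimal sketch of the general density map regression setup, not the network proposed here. It assumes point annotations for each class, a fixed Gaussian width `sigma`, and a channel order of vehicles first and people second; the function names are hypothetical and none of these details are taken from the paper. Each annotated object contributes a unit-mass Gaussian, so summing a channel recovers that class's count.

```python
import numpy as np


def point_density(shape, point, sigma=2.0):
    """Return a Gaussian blob centred on `point` (row, col) that sums to 1.

    Normalising over the whole image keeps the mass at exactly 1 even
    when the object lies close to the image border.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    py, px = point
    g = np.exp(-((ys - py) ** 2 + (xs - px) ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()


def two_channel_density_map(shape, vehicle_points, person_points, sigma=2.0):
    """Build a (2, H, W) density map.

    Assumed channel order (illustrative only): 0 = vehicles, 1 = people.
    """
    density = np.zeros((2, *shape), dtype=np.float64)
    for channel, points in enumerate((vehicle_points, person_points)):
        for point in points:
            density[channel] += point_density(shape, point, sigma)
    return density


def counts_from_density(density):
    """Integrate each channel to obtain the per-class object count."""
    return density.sum(axis=(1, 2))


# Hypothetical annotations: three vehicles and two people in a 64x64 image.
vehicles = [(10, 12), (30, 40), (5, 60)]
people = [(20, 20), (22, 21)]

dmap = two_channel_density_map((64, 64), vehicles, people)
counts = counts_from_density(dmap)
print(counts)  # approximately [3. 2.]
```

In a learned model, a map of this shape would serve as the regression target, and the predicted count for each class would be read off the same way, by summing the corresponding output channel.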
We evaluate and compare our method on the VisDrone+, ApolloScape+, and UAVDT+ datasets. The experimental results show that the method counts two kinds of dense objects simultaneously and outputs high-quality two-channel predicted density maps. Its counting performance is better than that of state-of-the-art counting networks for dense people and vehicle counting.
In future work, we will further improve the feature extraction ability of the model in complex traffic scenes to classify and count a variety of dense objects such as cars, pedestrians, and non-motor vehicles.