Integrated neural network framework for multi-object detection and recognition using UAV imagery.
Author information
Alshehri Mohammed, Xue Tingting, Mujtaba Ghulam, AlQahtani Yahya, Almujally Nouf Abdullah, Jalal Ahmad, Liu Hui
Affiliations
Department of Computer Science, King Khalid University, Abha, Saudi Arabia.
School of Environmental Science & Engineering, Nanjing University of Information Science and Technology, Nanjing, China.
Publication information
Front Neurorobot. 2025 Jul 30;19:1643011. doi: 10.3389/fnbot.2025.1643011. eCollection 2025.
INTRODUCTION
Accurate vehicle analysis from aerial imagery has become increasingly vital for emerging technologies and public-service applications such as intelligent traffic management, urban planning, autonomous navigation, and military surveillance. However, analyzing UAV-captured video poses several inherent challenges, including the small size of target vehicles, occlusion, cluttered urban backgrounds, motion blur, and fluctuating lighting conditions, all of which degrade the accuracy and consistency of conventional perception systems. To address these complexities, our research proposes a fully end-to-end, deep learning-driven perception pipeline specifically optimized for UAV-based traffic monitoring. The proposed framework integrates multiple advanced modules: RetinexNet for preprocessing, HRNet for segmentation that preserves high-resolution semantic information, and the YOLOv11 framework for vehicle detection. Deep SORT is employed for efficient vehicle tracking, while CSRNet enables high-density vehicle counting. LSTM networks predict vehicle trajectories from temporal patterns, and a combination of DenseNet and SuperPoint provides robust feature extraction. Finally, classification is performed using Vision Transformers (ViTs), whose attention mechanisms ensure accurate recognition across diverse categories. The modular yet unified architecture is designed to handle spatiotemporal dynamics, making it suitable for real-time deployment on diverse UAV platforms.
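To make the flow of data through these modules concrete, the sketch below shows one way the stages could be composed per frame in Python. The class name UAVPerceptionPipeline and its fields are hypothetical placeholders standing in for the named models; this illustrates the pipeline's structure under our assumptions, not the authors' released implementation.

```python
# Minimal sketch of the modular perception pipeline described above.
# Each callable stands in for the corresponding named model (RetinexNet,
# HRNet, YOLOv11, Deep SORT, CSRNet, LSTM, DenseNet+SuperPoint, ViT);
# all names here are illustrative assumptions, not published code.

from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class UAVPerceptionPipeline:
    """Chains the per-frame stages in the order the paper lists them."""
    enhance: Callable   # RetinexNet-style illumination normalization
    segment: Callable   # HRNet semantic segmentation (vehicle vs. background)
    detect: Callable    # YOLOv11 vehicle detector -> bounding boxes
    track: Callable     # Deep SORT -> per-vehicle identities over time
    count: Callable     # CSRNet density-based vehicle counting
    predict: Callable   # LSTM trajectory forecast from a track's history
    describe: Callable  # DenseNet + SuperPoint features fused by an autoencoder
    classify: Callable  # Vision Transformer vehicle classifier

    def process(self, frame: Any) -> dict:
        enhanced = self.enhance(frame)       # normalize lighting first
        masks = self.segment(enhanced)       # separate vehicles from background
        boxes = self.detect(enhanced)        # locate vehicles
        tracks = self.track(boxes)           # associate detections over time
        return {
            "masks": masks,
            "tracks": tracks,
            "count": self.count(enhanced),
            "trajectories": [self.predict(t) for t in tracks],
            "labels": [self.classify(self.describe(enhanced, t)) for t in tracks],
        }
```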
METHOD
The framework combines state-of-the-art neural networks, each chosen for a distinct sub-problem in aerial vehicle analysis. RetinexNet normalizes the illumination of each input frame during preprocessing. HRNet performs semantic segmentation, accurately separating vehicles from their surroundings. YOLOv11 provides fast, high-precision vehicle detection, and Deep SORT maintains reliable tracking without losing individual vehicle identities. CSRNet handles vehicle counting that remains robust to occlusion and dense traffic. LSTM models capture each vehicle's motion over time to forecast future positions. During feature extraction, DenseNet and SuperPoint embeddings are fused and refined with an autoencoder. Finally, attention-based Vision Transformer models classify vehicles viewed from above. Each component is developed and integrated to deliver improved performance in real-world UAV deployments.
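As one concrete example of a single stage, the trajectory-forecasting step could look like the following PyTorch sketch: a small LSTM that maps a track's past (x, y) box centers to its next position. The layer sizes, one-step output, and residual formulation are illustrative assumptions; the paper does not specify these details.

```python
# Hedged sketch of the trajectory-forecasting stage only; hyperparameters
# and the residual one-step output are assumptions for illustration.

import torch
import torch.nn as nn

class TrajectoryLSTM(nn.Module):
    def __init__(self, hidden_size: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 2)  # predict the next (x, y) offset

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (batch, timesteps, 2) past box centers from Deep SORT tracks
        out, _ = self.lstm(history)
        # last hidden state -> offset added to the last observed position
        return history[:, -1, :] + self.head(out[:, -1, :])

# Usage: eight past positions for one track -> one predicted next position.
track_history = torch.randn(1, 8, 2)
next_xy = TrajectoryLSTM()(track_history)  # shape (1, 2)
```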
RESULTS
Our proposed framework significantly improves the accuracy, reliability, and efficiency of vehicle analysis from UAV imagery. The pipeline was rigorously evaluated on two widely used datasets, AU-AIR and Roundabout. On the AU-AIR dataset, the system achieved a detection accuracy of 97.8%, a tracking accuracy of 96.5%, and a classification accuracy of 98.4%. Similarly, on the Roundabout dataset, it reached 96.9% detection accuracy, 94.4% tracking accuracy, and 97.7% classification accuracy. These results surpass previous benchmarks, demonstrating the system's robust performance across diverse aerial traffic scenarios. The integration of advanced models (YOLOv11 for detection, HRNet for segmentation, Deep SORT for tracking, CSRNet for counting, LSTM for trajectory prediction, and Vision Transformers for classification) enables the framework to maintain high accuracy even under challenging conditions such as occlusion, variable lighting, and scale variation.
DISCUSSION
The results show that the proposed deep learning system is capable of handling the challenges of aerial vehicle analysis and delivers reliable, precise results across all of the aforementioned tasks. Combining several advanced models ensures that the system performs robustly even under difficulties such as occlusion and scale variation.