Zhang Yang, Chen Xiaobing, Sun Su, You Hongfeng, Wang Yuanyuan, Lin Jianchu, Wang Jiacheng
Huaiyin Institute of Technology, College of Computer and Software Engineering, Huaian, 223003, China.
Key Laboratory of Smart City and Virtual Reality of Jiangsu Province, Huaian, 223002, China.
Sci Rep. 2025 Jul 11;15(1):25155. doi: 10.1038/s41598-025-09825-y.
To address poor vehicle detection performance in UAV aerial imagery, the difficulty of extracting small-target features, and the large parameter counts of existing models, we propose OSD-YOLOv10, an enhanced algorithm based on YOLOv10n. The proposed algorithm incorporates several key innovations. First, we use online convolutional reparameterization to construct the OCRConv module and design a lightweight feature extraction structure, SPCC, to replace the conventional C2f module, reducing both computational load and parameter count. Second, we integrate an efficient dual-layer feed-forward hybrid attention module to strengthen the model's feature extraction. Third, we construct a dual small-target detection layer that combines shallow and ultra-shallow features to improve small-target detection. Finally, we introduce the DySample dynamic upsampling module to enhance feature fusion in the neck network from a point-sampling perspective. Extensive experiments on the VisDrone-DET2019 and UAVDT datasets show that OSD-YOLOv10 reduces parameter count by 40.7% and floating-point operations by 3.6%, while improving accuracy and mean average precision by 1.3% and 1.6%, respectively. Compared with other YOLO-series and lightweight models, OSD-YOLOv10 delivers higher detection accuracy at lower computational complexity, achieving a strong balance between high accuracy and low resource consumption. This makes it well suited to deployment on UAV onboard hardware for vehicle detection tasks. Code will be available online (https://github.com/Z76y/OSD-YOLO).
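The abstract names online convolutional reparameterization as the basis of the OCRConv module but does not describe its internals. The sketch below illustrates only the general principle that reparameterization relies on: because convolution is linear, parallel linear branches can be folded into one equivalent kernel. It is not the authors' OCRConv. The two-branch layout (one 3x3 branch, one 1x1 branch), the per-branch scaling factors, and all function names are assumptions chosen for illustration, and the code is single-channel pure Python for clarity rather than speed.

```python
# Illustrative sketch only: NOT the OCRConv module from the paper.
# Assumed setup: a 3x3 branch and a 1x1 branch, each with a linear scale,
# applied to a single-channel image with zero padding and stride 1.

def conv2d_same(image, kernel):
    """Naive 3x3 cross-correlation with zero padding (output size = input size)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(3):
                for dj in range(3):
                    y, x = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def pad_1x1_to_3x3(weight):
    """Embed a 1x1 kernel at the centre of a zero 3x3 kernel."""
    return [[0.0, 0.0, 0.0], [0.0, weight, 0.0], [0.0, 0.0, 0.0]]


def two_branch_forward(image, kernel_3x3, weight_1x1, scales=(1.0, 1.0)):
    """Multi-branch form: run both branches, then sum the scaled outputs."""
    s3, s1 = scales
    a = conv2d_same(image, kernel_3x3)
    b = conv2d_same(image, pad_1x1_to_3x3(weight_1x1))
    return [[s3 * a[i][j] + s1 * b[i][j] for j in range(len(a[0]))]
            for i in range(len(a))]


def merge_branches(kernel_3x3, weight_1x1, scales=(1.0, 1.0)):
    """Fold both scaled branches into one equivalent 3x3 kernel."""
    s3, s1 = scales
    k1 = pad_1x1_to_3x3(weight_1x1)
    return [[s3 * kernel_3x3[r][c] + s1 * k1[r][c] for c in range(3)]
            for r in range(3)]


if __name__ == "__main__":
    image = [[1.0, 2.0, 3.0, 4.0],
             [5.0, 6.0, 7.0, 8.0],
             [9.0, 10.0, 11.0, 12.0]]
    kernel_3x3 = [[0.5, -1.0, 0.25],
                  [1.0, 2.0, -0.5],
                  [0.0, 0.75, -1.5]]
    scales = (0.5, 1.5)

    multi = two_branch_forward(image, kernel_3x3, 2.0, scales)
    single = conv2d_same(image, merge_branches(kernel_3x3, 2.0, scales))
    max_diff = max(abs(multi[i][j] - single[i][j])
                   for i in range(len(image)) for j in range(len(image[0])))
    print("max abs difference between the two forms:", max_diff)
```

The merged kernel costs one convolution instead of two and stores nine weights instead of ten, which is the kind of saving that motivates reparameterized blocks. Whether OCRConv uses this branch layout, or merges branches during training as well as at inference, should be checked against the paper and the released code.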