Hu Mengzi, Li Ziyang, Yu Jiong, Wan Xueqiang, Tan Haotian, Lin Zeyu
School of Software, Xinjiang University, Urumqi 830091, China.
College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China.
Sensors (Basel). 2023 Jul 15;23(14):6423. doi: 10.3390/s23146423.
The most significant technical challenges in current aerial image object detection are the extremely low accuracy on small, densely distributed objects and the lack of semantic information. Moreover, existing detectors with large parameter counts are unsuitable for aerial image object-detection scenarios that target low-end GPUs. To address these challenges, we propose efficient-lightweight You Only Look Once (EL-YOLO), a model that overcomes the limitations of existing detectors and is oriented toward low-end GPUs. EL-YOLO improves on the baseline models in three key areas. First, we design and evaluate three model architectures to intensify the model's focus on small objects and identify the most effective network structure. Second, we design efficient spatial pyramid pooling (ESPP) to strengthen the representation of small-object features in aerial images. Finally, we introduce the alpha-complete intersection over union (α-CIoU) loss function to tackle the imbalance between positive and negative samples in aerial images. The proposed EL-YOLO method demonstrates strong generalization and robustness on the small-object detection problem in aerial images. The experimental results show that, with model parameters kept below 10 M and the input image size unified at 640 × 640 pixels, the average precision of EL-YOLOv5 reached 10.8% and 10.7% on two challenging aerial image datasets, DIOR and VisDrone, respectively, improvements of 1.9% and 2.2% over YOLOv5.
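As an illustration of the α-CIoU loss named in the abstract, the following is a minimal sketch for a single pair of axis-aligned boxes. It assumes the commonly cited α-IoU form of the loss, 1 − IoU^α + (ρ²/c²)^α + (βv)^α, where ρ is the distance between box centers, c is the diagonal of the smallest enclosing box, and v measures aspect-ratio mismatch. The function name `alpha_ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the default `alpha=3.0` are assumptions made for this example; the abstract does not state the exact formulation or the α value used in EL-YOLO.

```python
import math


def alpha_ciou_loss(box_pred, box_gt, alpha=3.0, eps=1e-9):
    """Sketch of an alpha-CIoU loss for two boxes given as (x1, y1, x2, y2).

    Assumed form: 1 - IoU^alpha + (rho^2 / c^2)^alpha + (beta * v)^alpha.
    With alpha = 1 this reduces to the ordinary CIoU loss.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt

    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(px2, gx2) - min(px1, gx1)
    enc_h = max(py2, gy2) - min(py1, gy1)
    c2 = enc_w ** 2 + enc_h ** 2 + eps

    # Squared distance between the two box centers.
    rho2 = (((px1 + px2) - (gx1 + gx2)) ** 2
            + ((py1 + py2) - (gy1 + gy2)) ** 2) / 4.0

    # Aspect-ratio consistency term v and its trade-off weight beta.
    v = (4.0 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    beta = v / (1.0 - iou + v + eps)

    return 1.0 - iou ** alpha + (rho2 / c2) ** alpha + (beta * v) ** alpha


if __name__ == "__main__":
    # A well-aligned prediction should score lower than a distant one.
    print(alpha_ciou_loss((0, 0, 10, 10), (1, 1, 11, 11)))
    print(alpha_ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Raising each term to the power α > 1 changes how the loss and its gradient are weighted across boxes with different overlap, which is the general motivation for the α-IoU family; how this interacts with the sample imbalance described in the abstract is specific to the paper and is not reproduced here.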