Luo Xin, Li Jiatian, A Xiaohui, Deng Yuxi
Faculty of Land and Resources Engineering, Kunming University of Science and Technology, Kunming 650093, China.
Sensors (Basel). 2025 Jan 7;25(2):306. doi: 10.3390/s25020306.
To address missed detections caused by insufficient shape and texture features and blurred boundaries in existing detection methods, this paper introduces a novel moving vehicle detection approach for satellite videos. The proposed method leverages frame difference and convolution to effectively integrate spatiotemporal information. First, a frame difference module (FDM) is designed that combines frame difference and convolution. This module extracts motion features between adjacent frames by frame differencing, refines them through backpropagation in the neural network, and integrates them with the current frame to compensate for the motion information missing from single-frame images. Next, the initial features are processed by a backbone network to further extract spatiotemporal feature information. The neck incorporates deformable convolution, which adaptively adjusts the sampling positions of convolution kernels, optimizing feature representation and enabling effective multiscale information integration. Additionally, shallow large-scale feature maps, whose smaller receptive fields focus on small targets and reduce background interference, are fed into the detection head. To enhance small-target feature representation, a small-target self-reconstruction module (SR-TOD) is introduced between the neck and the detection head. Experiments on the Jilin-1 satellite video dataset demonstrate that the proposed method outperforms the comparison models, significantly reducing missed detections caused by weak color and texture features and blurred boundaries. For the satellite-video moving vehicle detection task, the method achieves notable improvements, with an average F1-score gain of 3.9% and a 7 s reduction in per-frame processing time compared to the next best model, DSFNet.
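The sketch below illustrates the frame-difference idea described in the abstract: a raw adjacent-frame difference provides motion cues, learnable convolutions refine them during training, and the result is fused with current-frame appearance features before the backbone. This is a minimal PyTorch sketch under assumed details; the class name FrameDifferenceModule, the channel sizes, and the fusion layout are illustrative assumptions, not the authors' released implementation.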
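```python
# Minimal sketch of a frame-difference module (FDM) in the spirit of the abstract:
# motion cues from an adjacent-frame difference are refined by learnable convolutions
# (updated via backpropagation) and fused with the current frame. All layer names and
# channel sizes are assumptions for illustration only.
import torch
import torch.nn as nn


class FrameDifferenceModule(nn.Module):
    def __init__(self, in_channels: int = 3, feat_channels: int = 16):
        super().__init__()
        # Learnable refinement of the raw frame-difference map.
        self.diff_conv = nn.Sequential(
            nn.Conv2d(in_channels, feat_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(feat_channels),
            nn.ReLU(inplace=True),
        )
        # Project the current frame so it can be fused with the motion features.
        self.frame_conv = nn.Sequential(
            nn.Conv2d(in_channels, feat_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(feat_channels),
            nn.ReLU(inplace=True),
        )
        # Fuse appearance (current frame) and motion (frame difference) features.
        self.fuse = nn.Conv2d(2 * feat_channels, feat_channels, kernel_size=1)

    def forward(self, frame_t: torch.Tensor, frame_prev: torch.Tensor) -> torch.Tensor:
        diff = torch.abs(frame_t - frame_prev)        # raw motion cue between adjacent frames
        motion_feat = self.diff_conv(diff)            # refined motion features
        appearance_feat = self.frame_conv(frame_t)    # current-frame appearance features
        fused = torch.cat([appearance_feat, motion_feat], dim=1)
        return self.fuse(fused)                       # spatiotemporal features for the backbone


if __name__ == "__main__":
    # Example: two adjacent 512x512 RGB satellite video frames (random tensors).
    fdm = FrameDifferenceModule()
    f_t = torch.randn(1, 3, 512, 512)
    f_prev = torch.randn(1, 3, 512, 512)
    out = fdm(f_t, f_prev)
    print(out.shape)  # torch.Size([1, 16, 512, 512])
```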