Nguyen Minh Tai Pham, Nguyen Quoc Duy Nam, Le Hoang Viet Anh, Tran Minh Khue Phan, Nakano Tadashi, Tran Thi Hong
Faculty of Advanced Program, Ho Chi Minh City Open University, Ho Chi Minh City, 700000, Vietnam.
Department of Core Informatics, Graduate School of Informatics, Osaka Metropolitan University, Osaka, 558-8585, Japan.
Sci Rep. 2025 Aug 5;15(1):28629. doi: 10.1038/s41598-025-14035-7.
Remote sensing object detection has recently emerged as a challenging topic in deep learning applications because it demands both high detection performance and computational efficiency. To address these demands, this study introduces an efficient one-stage object detector designed mainly for detecting objects in remote sensing images, built on several innovations. First, we propose an extraction block called PRepConvBlock that combines reparameterization convolution with partial feature utilization to reduce the complexity of convolution operations. This makes larger kernel sizes affordable, which forms longer-range interactions between features and significantly expands the receptive field. Second, we propose SB-FPN, a shallow multi-scale fusion framework based on Bi-FPN that exploits cross-interaction between shallow and deeper scales while inheriting the bidirectional connections of Bi-FPN to enhance the visual representation of features. Finally, we combine these components into the Shallow-level Optimized Reparameterization Architecture Detector (SORA-DET), which is designed for UAV remote sensing object detection tasks and employs up to four detection heads. The proposed detector achieves competitive performance, outperforming most larger models and state-of-the-art (SOTA) works. Specifically, SORA-DET achieves 39.3% mAP50 on the VisDrone2019 test set and up to 84.0% mAP50 on the SeaDroneSeeV2 validation set. Furthermore, compared with other large-scale one-stage detectors, the proposed detector has nearly 88.1% fewer parameters and an inference time of only 5.4 ms.
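The abstract names two ideas behind PRepConvBlock without giving their details: reparameterization convolution and partial feature utilization. The sketch below is a minimal illustration of the general techniques, not the authors' block. It assumes that reparameterization means folding parallel convolution branches with summed outputs into one larger kernel, and that partial feature utilization means convolving only a fraction of the channels and passing the rest through unchanged. Batch normalization, multi-channel mixing, and strides are omitted, and every name (`conv2d_same`, `pad_kernel`, `merge_branches`, `partial_conv`, `ratio`) is hypothetical.

```python
# Sketch only: single-channel 2D maps as nested lists, stride 1, zero padding.
# All names are hypothetical and are not taken from the SORA-DET paper.

def conv2d_same(image, kernel):
    """Cross-correlate `image` with an odd-sized square `kernel` (zero-padded)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy, xx = y + i - r, x + j - r
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def pad_kernel(kernel, size):
    """Embed a small odd kernel at the centre of a size x size zero kernel."""
    k = len(kernel)
    off = (size - k) // 2
    padded = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            padded[i + off][j + off] = kernel[i][j]
    return padded


def merge_branches(kernels):
    """Fold parallel branches (whose outputs are summed) into one kernel."""
    size = max(len(k) for k in kernels)
    merged = [[0.0] * size for _ in range(size)]
    for kernel in kernels:
        padded = pad_kernel(kernel, size)
        for i in range(size):
            for j in range(size):
                merged[i][j] += padded[i][j]
    return merged


def partial_conv(channels, kernel, ratio=0.25):
    """Convolve only the first `ratio` of channels; pass the rest through."""
    n_conv = max(1, int(len(channels) * ratio))
    return [conv2d_same(c, kernel) for c in channels[:n_conv]] + channels[n_conv:]


# Arbitrary test data, chosen only to exercise the code.
image = [[float((3 * y + 5 * x) % 7) for x in range(6)] for y in range(6)]
k5 = [[0.1 * (i - j) for j in range(5)] for i in range(5)]
k3 = [[0.5, -1.0, 0.5], [1.0, 2.0, 1.0], [0.0, -0.5, 0.0]]
k1 = [[1.5]]

# Training-time form: three parallel branches whose outputs are summed.
b = [conv2d_same(image, k) for k in (k5, k3, k1)]
multi = [[b[0][y][x] + b[1][y][x] + b[2][y][x] for x in range(6)] for y in range(6)]

# Inference-time form: a single 5x5 convolution with the merged kernel.
single = conv2d_same(image, merge_branches([k5, k3, k1]))
max_diff = max(abs(multi[y][x] - single[y][x]) for y in range(6) for x in range(6))
```

Because convolution is linear, `multi` and `single` agree up to floating-point rounding, so the extra branches add no cost at inference time. With `ratio=0.25`, `partial_conv` applies the kernel to one channel in four, which is one plausible way a block could afford larger kernels at lower cost.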