Zhu Jiangang, Ruan Yang, Jing Donglin, Fu Qiang, Ma Ting
School of Computer Science, Civil Aviation Flight University of China, Guanghan 618307, China.
Shanghai Aerospace Control Technology Institute, Shanghai 201109, China.
Sensors (Basel). 2025 Feb 20;25(5):1285. doi: 10.3390/s25051285.
Conventional object detection methods struggle with the complexity of targets in optical remote sensing images (ORSIs), which include multi-scale objects, high aspect ratios, and arbitrary orientations. This study proposes a novel detection framework called Progressive Self-Modulating Detector (PSMDet), which incorporates self-modulation mechanisms at the backbone, feature pyramid network (FPN), and detection head stages to address these issues. The backbone uses a reparameterized large kernel network (RLK-Net) to enhance multi-scale feature extraction, while the adaptive perception network (APN) achieves accurate feature alignment through a self-attention mechanism. Additionally, a Gaussian-based bounding box representation and a smooth relative entropy (smoothRE) regression loss are introduced to address the discontinuities and inconsistencies of traditional bounding box regression. Experimental validation on the HRSC2016 and UCAS-AOD datasets demonstrates the framework's robust performance, with mean Average Precision (mAP) scores of 90.69% and 89.86%, respectively. Although validated on ORSIs, the proposed framework is adaptable to broader applications where high-precision object detection is essential, such as autonomous driving in intelligent transportation systems and defect detection in industrial vision. These contributions provide theoretical and technical support for advancing intelligent image sensor-based applications across multiple domains.
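The Gaussian-based box representation and relative-entropy loss mentioned above can be illustrated with a minimal sketch. It assumes the construction commonly used in Gaussian-based rotated detection: the box center becomes the mean, and the covariance is the rotated diagonal matrix of squared half-extents. The abstract does not give PSMDet's exact mapping or the form of the smoothing in smoothRE, so the code computes only the plain relative entropy (KL divergence) between two box Gaussians. The function names `box_to_gaussian` and `kl_divergence` are hypothetical.

```python
import math

def box_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box (center, size, angle in radians) to a 2D Gaussian.

    Assumed convention: Sigma = R diag((w/2)^2, (h/2)^2) R^T.
    Returns (mean, (sxx, sxy, syy)); the covariance is symmetric.
    """
    c, s = math.cos(theta), math.sin(theta)
    lw, lh = (w / 2) ** 2, (h / 2) ** 2
    sxx = lw * c * c + lh * s * s
    sxy = (lw - lh) * c * s
    syy = lw * s * s + lh * c * c
    return (cx, cy), (sxx, sxy, syy)

def kl_divergence(p, q):
    """Relative entropy KL(p || q) between two 2D Gaussians."""
    (mpx, mpy), (pa, pb, pc) = p
    (mqx, mqy), (qa, qb, qc) = q
    det_p = pa * pc - pb * pb
    det_q = qa * qc - qb * qb
    # Closed-form inverse of the 2x2 covariance of q.
    ia, ib, ic = qc / det_q, -qb / det_q, qa / det_q
    trace = ia * pa + 2 * ib * pb + ic * pc          # tr(Sigma_q^-1 Sigma_p)
    dx, dy = mqx - mpx, mqy - mpy
    maha = ia * dx * dx + 2 * ib * dx * dy + ic * dy * dy
    return 0.5 * (trace + maha - 2.0 + math.log(det_q / det_p))

# Two parameterizations of the same physical box: width and height swapped
# and the angle shifted by 90 degrees.
box_a = box_to_gaussian(100.0, 50.0, 40.0, 10.0, 0.3)
box_b = box_to_gaussian(100.0, 50.0, 10.0, 40.0, 0.3 + math.pi / 2)
# The same box shifted by 5 pixels along x.
box_c = box_to_gaussian(105.0, 50.0, 40.0, 10.0, 0.3)

same_box_kl = kl_divergence(box_a, box_b)
shifted_kl = kl_divergence(box_a, box_c)
```

Under this convention `box_a` and `box_b` map to the same Gaussian, so their divergence is zero up to floating-point error even though their raw parameters differ. This is the kind of parameterization discontinuity a Gaussian representation is meant to avoid. The shifted box yields a strictly positive divergence.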