Li Zhiqiang, Xiang Jian, Duan Jiawen
School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou, 310023, China.
Sci Rep. 2024 Nov 23;14(1):29058. doi: 10.1038/s41598-024-80265-w.
Current target detection methods perform well under normal lighting conditions; however, they struggle to extract features effectively in low-illumination environments, leading to false and missed detections. To address these issues, this study introduces an efficient target detection method for low illumination, named DimNet. The method improves the model in four areas: multi-scale feature fusion, feature extraction, the detection head, and the loss function. First, a new neck structure replaces that of the original model to perform efficient multi-scale feature fusion, allowing high-level semantic information and low-level spatial information to be fully exchanged. Second, a new feature aggregation module fuses channel and spatial information as well as local and global information simultaneously, improving the representational capacity of the network. Third, a new detection head is designed by replacing the original convolutional layers and applying reparameterization, which improves recognition in complex scenes; its size is then reduced through parameter sharing, balancing detection accuracy against computational cost. Finally, to address the blurred-boundary problem that arises in low illumination, where insufficient lighting makes target boundaries resemble the surrounding background, a new loss function is designed. It focuses on the center of the target, de-emphasizes the aspect ratio of the predicted box, and employs a dynamic gradient gain assignment strategy that reduces the influence of low-quality anchor boxes and improves target localization accuracy.
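The abstract does not give the loss formula, but the description (center-focused distance term, no aspect-ratio term, dynamic gradient gain that down-weights low-quality anchors) resembles a Wise-IoU-style bounding-box loss. The sketch below is an illustrative reconstruction, not the paper's actual loss: the functions, the `alpha`/`delta` constants, and the running-mean normalization are all assumptions.

```python
import math

def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def center_focused_loss(pred, target, enclosing_diag_sq):
    """IoU loss scaled by a center-distance attention term.

    There is deliberately no aspect-ratio penalty, matching the abstract's
    claim that the loss weakly considers the predicted box's aspect ratio.
    `enclosing_diag_sq` is the squared diagonal of the smallest enclosing box.
    """
    cxp, cyp = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cxt, cyt = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    dist_sq = (cxp - cxt) ** 2 + (cyp - cyt) ** 2
    l_iou = 1.0 - iou(pred, target)
    # exp(normalized center distance) amplifies the loss when centers diverge
    return math.exp(dist_sq / enclosing_diag_sq) * l_iou

def dynamic_gain(l_iou, running_mean_l_iou, alpha=1.9, delta=3.0):
    """Non-monotonic gradient gain keyed to the anchor's 'outlier degree'.

    Anchors whose IoU loss is far above the running mean (very low quality)
    receive a small gain, so they contribute weak gradients. Constants are
    illustrative placeholders, not values from the paper.
    """
    beta = l_iou / (running_mean_l_iou + 1e-9)  # outlier degree
    return beta / (delta * alpha ** (beta - delta))
```

A training loop would multiply `center_focused_loss` by `dynamic_gain`, with `running_mean_l_iou` tracked as an exponential moving average over the batch, so the gain adapts as overall anchor quality improves during training.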
Experimental results show that DimNet achieves a mAP of 75.60% on the ExDark dataset, an improvement of 3.77% over the baseline model and 2.25% over the state-of-the-art (SOTA) model, clearly outperforming previous and current SOTA methods in detection accuracy and other performance metrics.