A Multi-Scale Target Detection Method Using an Improved Faster Region Convolutional Neural Network Based on Enhanced Backbone and Optimized Mechanisms

Authors

Chen Qianyong, Li Mengshan, Lai Zhenghui, Zhu Jihong, Guan Lixin

Affiliation

College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, China.

Publication

J Imaging. 2024 Aug 13;10(8):197. doi: 10.3390/jimaging10080197.

DOI:10.3390/jimaging10080197

PMID:39194986

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11355408/

Abstract

Existing deep learning methods still show notable limitations in multi-target detection, including low accuracy and high rates of false and missed detections. This paper proposes an improved Faster R-CNN algorithm that aims to strengthen the detection of multi-scale targets. It makes three improvements to Faster R-CNN. First, it uses the ResNet101 network to extract features from the input image, which provides stronger feature extraction capability. Second, it integrates Online Hard Example Mining (OHEM), Soft Non-Maximum Suppression (Soft-NMS), and Distance Intersection over Union (DIOU) modules, which mitigate the imbalance between positive and negative samples and the tendency for small targets to be missed during model training. Third, the Region Proposal Network (RPN) is simplified to achieve faster detection and a lower miss rate. A multi-scale training (MST) strategy is also used to train the improved Faster R-CNN, balancing detection accuracy and efficiency. Compared with other detection models, the improved Faster R-CNN shows significant advantages in mAP@0.5, F1-score, and log-average miss rate (LAMR). The proposed model offers useful insights for fields such as smart agriculture, medical diagnosis, and face recognition.
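
To make two of the named components concrete, the following is a minimal sketch of the commonly used definitions of DIOU and Gaussian Soft-NMS. It is not the authors' implementation: the abstract does not state which Soft-NMS variant is used or how DIOU is combined with it, so the function names, the `(x1, y1, x2, y2)` box format, and the `sigma` and `score_threshold` defaults are all illustrative assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def diou(a, b):
    """Distance-IoU: IoU minus the normalised squared centre distance."""
    # Squared distance between the two box centres.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou(a, b) - (d2 / c2 if c2 > 0 else 0.0)


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (assumed variant).

    Instead of discarding boxes that overlap the current best box,
    their scores are decayed by exp(-IoU^2 / sigma). Returns a list of
    (box index, decayed score) pairs in selection order.
    """
    remaining = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        best = max(remaining, key=lambda t: t[1])
        remaining.remove(best)
        kept.append(best)
        decayed = []
        for i, s in remaining:
            s *= math.exp(-(iou(boxes[best[0]], boxes[i]) ** 2) / sigma)
            if s >= score_threshold:  # drop boxes whose score has collapsed
                decayed.append((i, s))
        remaining = decayed
    return kept
```

With hard NMS a duplicate box would be removed outright; here it is kept with a reduced score, which is the usual motivation for Soft-NMS when objects overlap. DIOU additionally penalises boxes whose centres are far apart, so it can be negative for distant boxes even when IoU is zero.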
