Liu Xuesong, Chu Renxin, Liu Baolin
IEEE Trans Neural Netw Learn Syst. 2024 Sep 11;PP. doi: 10.1109/TNNLS.2024.3454063.
Detecting small signs in complex real-world environments remains challenging because such signs carry limited feature information and are subject to interference from other objects. In this article, we propose a novel text feature-guided network (TFG-Net) that improves small-sign detection both by enhancing the feature information of small signs and by reducing the influence of other objects. As the name suggests, TFG-Net incorporates a text detection branch, which extracts additional textual features from the signs and supplies them to the object detection branch. Furthermore, the object detection branch of TFG-Net optimizes the backbone network's output structure by merging deep features and introducing a high-resolution feature layer. Finally, a fusion method that enhances both overall and local features is proposed to fully integrate detailed and semantic information. Experimental results show that TFG-Net achieves the highest mean average precision (mAP) on three public datasets: 92.5% on Tsinghua-Tencent 100K (TT100K), 83.7% on CCTSDB2021, and 79.1% on DFG, surpassing current state-of-the-art object detectors.