Li Yazhao, Pang Yanwei, Cao Jiale, Shen Jianbing, Shao Ling
IEEE Trans Image Process. 2021;30:2708-2721. doi: 10.1109/TIP.2020.3048630. Epub 2021 Feb 10.
Single-shot detectors have attracted considerable attention recently because they run in real time and their accuracy has improved. To handle complex scale variations, single-shot detectors make scale-aware predictions on multiple pyramid layers. Typically, small objects are detected on shallow layers and large objects on deep layers. However, the features in the pyramid are not scale-aware enough, which limits detection performance. Object scale variation causes two common problems in single-shot detectors: (1) the false negative problem, i.e., small objects are easily missed because their features are weak; (2) the part-false positive problem, i.e., the salient part of a large object is sometimes detected as an object in its own right. Based on this observation, this paper proposes a new Neighbor Erasing and Transferring (NET) mechanism that unmixes feature scales to obtain scale-aware features. In NET, a Neighbor Erasing Module (NEM) erases the salient features of large objects and emphasizes the features of small objects in shallow layers. A Neighbor Transferring Module (NTM) transfers the erased features to deep layers, where they highlight large objects. With this mechanism, a single-shot network called NETNet is constructed for scale-aware object detection. In addition, we propose aggregating the nearest neighboring pyramid features to enhance NET. Experiments on the MS COCO and UAVDT datasets demonstrate the effectiveness of our method. On MS COCO, NETNet obtains 38.5% AP at 27 FPS and 32.0% AP at 55 FPS. As a result, NETNet achieves a better trade-off between real-time speed and detection accuracy.
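The erase-and-transfer idea can be illustrated with a minimal NumPy sketch. This is not the NETNet implementation: the abstract does not specify how the erasing gate is computed or how features are resampled, so the sigmoid gate derived from the deeper layer, the nearest-neighbor upsampling, the 2x2 average pooling, and all function names below are assumptions made purely for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def upsample2x(x):
    """Nearest-neighbor 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def avgpool2x(x):
    """2x2 average pooling of a (C, H, W) feature map (H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def neighbor_erase_transfer(shallow, deep):
    """Hypothetical erase/transfer step between two neighboring pyramid levels.

    shallow: (C, 2H, 2W) features from the shallower layer
    deep:    (C, H, W)   features from the deeper layer
    """
    # Assumed gate: a spatial map in (0, 1) that is high where the deep
    # layer responds strongly, i.e. where large objects are presumed to be.
    gate = sigmoid(upsample2x(deep).mean(axis=0, keepdims=True))

    # Erasing (NEM-like): remove the gated part from the shallow features.
    erased = gate * shallow
    shallow_out = shallow - erased

    # Transferring (NTM-like): downsample the erased part and add it
    # to the deep features.
    deep_out = deep + avgpool2x(erased)
    return shallow_out, deep_out, erased


# Toy usage with random feature maps.
rng = np.random.default_rng(0)
shallow = rng.standard_normal((8, 16, 16))
deep = rng.standard_normal((8, 8, 8))
shallow_out, deep_out, erased = neighbor_erase_transfer(shallow, deep)
```

In this sketch nothing is discarded: whatever is subtracted from the shallow level is the same quantity that is pooled and added to the deep level, which mirrors the abstract's description of transferring the erased features rather than throwing them away.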