

Generalized Focal Loss: Towards Efficient Representation Learning for Dense Object Detection.

Publication

IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3139-3153. doi: 10.1109/TPAMI.2022.3180392. Epub 2023 Feb 3.

DOI: 10.1109/TPAMI.2022.3180392
PMID: 35679384
Abstract

Object detection is a fundamental computer vision task that simultaneously predicts the category and localization of the targets of interest. Recently one-stage (also termed "dense") detectors have gained much attention over two-stage ones due to their simple pipeline and friendly application to end devices. Dense object detectors basically formulate object detection as dense classification and localization (i.e., bounding box regression). The classification is usually optimized by Focal Loss and the box location is commonly learned under Dirac delta distribution. A recent trend for dense detectors is to introduce an individual prediction branch to estimate the quality of localization, which facilitates the classification to improve detection performance. This paper delves into the representations of the above three fundamental elements: quality estimation, classification and localization. Three problems are discovered in existing practices, including (1) the inconsistent usage of the quality estimation and classification between training and inference, (2) the inflexible Dirac delta distribution for localization, and (3) the deficient and implicit guidance for accurate quality estimation. To address these problems, we design new representations for these elements. Specifically, we merge the quality estimation into the class prediction vector to form a joint representation, use a vector to represent arbitrary distribution of box locations, and extract discriminant feature descriptors from the distribution vector for more reliable quality estimation. The improved representations eliminate the inconsistency risk and accurately depict the flexible distribution in real data, but contain continuous labels, which is beyond the scope of Focal Loss. We then propose Generalized Focal Loss (GFocal) that generalizes Focal Loss from its discrete form to the continuous version for successful optimization. Extensive experiments demonstrate the effectiveness of our method, without sacrificing the efficiency both in training and inference. Based on GFocal, we construct a considerably fast and lightweight detector termed NanoDet under mobile settings, which is 1.8 AP higher, 2x faster and 6x smaller than scaled YoloV4-Tiny.
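The two representations the abstract describes — a joint classification-quality score trained against a continuous target, and box offsets learned as a discrete distribution over bins — can be sketched numerically. The following NumPy sketch is illustrative only, not the authors' reference implementation; function names, the bin layout, and the per-element (non-batched) interface are assumptions:

```python
import numpy as np

def quality_focal_loss(sigma, y, beta=2.0, eps=1e-12):
    """Quality Focal Loss: Focal Loss generalized to a continuous target.

    sigma: predicted joint classification-IoU score in (0, 1)
    y:     continuous quality label (e.g. IoU of the predicted box) in [0, 1];
           with y restricted to {0, 1} this reduces to the original Focal Loss
    beta:  focusing parameter that down-weights well-predicted examples
    """
    sigma = np.clip(sigma, eps, 1.0 - eps)
    # binary cross-entropy against the continuous label y
    ce = -((1.0 - y) * np.log(1.0 - sigma) + y * np.log(sigma))
    # modulating factor |y - sigma|^beta replaces (1 - p_t)^gamma
    return np.abs(y - sigma) ** beta * ce

def distribution_focal_loss(probs, y, bins, eps=1e-12):
    """Distribution Focal Loss: learn a discrete distribution over box offsets.

    probs: predicted (softmax) probabilities over the bin centers
    y:     continuous regression target with bins[0] <= y <= bins[-1]
    bins:  1-D array of increasing, equally spaced bin centers

    The loss pushes probability mass onto the two bins that bracket y, so the
    distribution's expectation can recover the continuous target.
    """
    probs = np.clip(probs, eps, 1.0)
    i = int(np.searchsorted(bins, y, side="right")) - 1
    i = min(max(i, 0), len(bins) - 2)  # clamp to a valid bracketing interval
    yi, yi1 = bins[i], bins[i + 1]
    # linear interpolation weights for the left and right neighbors of y
    w_left = (yi1 - y) / (yi1 - yi)
    w_right = (y - yi) / (yi1 - yi)
    return -(w_left * np.log(probs[i]) + w_right * np.log(probs[i + 1]))
```

For instance, with bins [0, 1, 2, 3] and target y = 1.5, `distribution_focal_loss` is minimized when the probability mass is split evenly between bins 1 and 2, which is exactly the distribution whose expectation equals 1.5.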


Similar Articles

1. Generalized Focal Loss: Towards Efficient Representation Learning for Dense Object Detection.
IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3139-3153. doi: 10.1109/TPAMI.2022.3180392. Epub 2023 Feb 3.
2. Interactive Regression and Classification for Dense Object Detector.
IEEE Trans Image Process. 2022;31:3684-3696. doi: 10.1109/TIP.2022.3174391. Epub 2022 May 26.
3. Training Robust Object Detectors From Noisy Category Labels and Imprecise Bounding Boxes.
IEEE Trans Image Process. 2021;30:5782-5792. doi: 10.1109/TIP.2021.3085208. Epub 2021 Jun 23.
4. CCDet: Confidence-Consistent Learning for Dense Object Detection.
IEEE Trans Image Process. 2024;33:2746-2758. doi: 10.1109/TIP.2024.3378457. Epub 2024 Apr 9.
5. Precision Detection of Dense Plums in Orchards Using the Improved YOLOv4 Model.
Front Plant Sci. 2022 Mar 11;13:839269. doi: 10.3389/fpls.2022.839269. eCollection 2022.
6. Focal Loss for Dense Object Detection.
IEEE Trans Pattern Anal Mach Intell. 2020 Feb;42(2):318-327. doi: 10.1109/TPAMI.2018.2858826. Epub 2018 Jul 23.
7. Hierarchical Regression and Classification for Accurate Object Detection.
IEEE Trans Neural Netw Learn Syst. 2023 May;34(5):2425-2439. doi: 10.1109/TNNLS.2021.3106641. Epub 2023 May 2.
8. IoU Regression with H+L-Sampling for Accurate Detection Confidence.
Sensors (Basel). 2021 Jun 28;21(13):4433. doi: 10.3390/s21134433.
9. PDNet: Towards Better One-stage Object Detection with Prediction Decoupling.
IEEE Trans Image Process. 2022 Jul 28;PP. doi: 10.1109/TIP.2022.3193223.
10. End-to-End Implicit Object Pose Estimation.
Sensors (Basel). 2024 Sep 3;24(17):5721. doi: 10.3390/s24175721.

Cited By

1. Recent advances in deep learning for lymphoma segmentation: Clinical applications and challenges.
Digit Health. 2025 Jul 28;11:20552076251362508. doi: 10.1177/20552076251362508. eCollection 2025 Jan-Dec.
2. Comparison of mask R-CNN and YOLOv8-seg for improved monitoring of the PCB surface during laser cleaning.
Sci Rep. 2025 May 17;15(1):17185. doi: 10.1038/s41598-025-02131-7.
3. MADNet: Marine Animal Detection Network using the YOLO platform.
PLoS One. 2025 May 8;20(5):e0322799. doi: 10.1371/journal.pone.0322799. eCollection 2025.
4. Research on Innovative Apple Grading Technology Driven by Intelligent Vision and Machine Learning.
Foods. 2025 Jan 15;14(2):258. doi: 10.3390/foods14020258.
5. Enhancing physician support in pancreatic cancer diagnosis: New M-F-RCNN artificial intelligence model using endoscopic ultrasound.
Endosc Int Open. 2024 Nov 7;12(11):E1277-E1284. doi: 10.1055/a-2422-9214. eCollection 2024 Nov.
6. Automated segmentation and source prediction of bone tumors using ConvNeXtv2 Fusion based Mask R-CNN to identify lung cancer metastasis.
J Bone Oncol. 2024 Sep 26;48:100637. doi: 10.1016/j.jbo.2024.100637. eCollection 2024 Oct.
7. scHiCyclePred: a deep learning framework for predicting cell cycle phases from single-cell Hi-C data using multi-scale interaction information.
Commun Biol. 2024 Jul 31;7(1):923. doi: 10.1038/s42003-024-06626-3.
8. Enhanced Water Surface Object Detection with Dynamic Task-Aligned Sample Assignment and Attention Mechanisms.
Sensors (Basel). 2024 May 14;24(10):3104. doi: 10.3390/s24103104.
9. Application of Machine Vision Techniques in Low-Cost Devices to Improve Efficiency in Precision Farming.
Sensors (Basel). 2024 Jan 31;24(3):937. doi: 10.3390/s24030937.
10. A simplified network topology for fruit detection, counting and mobile-phone deployment.
PLoS One. 2023 Oct 9;18(10):e0292600. doi: 10.1371/journal.pone.0292600. eCollection 2023.