

Generalized Focal Loss: Towards Efficient Representation Learning for Dense Object Detection.

Publication

IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3139-3153. doi: 10.1109/TPAMI.2022.3180392. Epub 2023 Feb 3.

DOI: 10.1109/TPAMI.2022.3180392
PMID: 35679384
Abstract

Object detection is a fundamental computer vision task that simultaneously predicts the category and localization of the targets of interest. Recently one-stage (also termed "dense") detectors have gained much attention over two-stage ones due to their simple pipeline and friendly application to end devices. Dense object detectors basically formulate object detection as dense classification and localization (i.e., bounding box regression). The classification is usually optimized by Focal Loss and the box location is commonly learned under Dirac delta distribution. A recent trend for dense detectors is to introduce an individual prediction branch to estimate the quality of localization, which facilitates the classification to improve detection performance. This paper delves into the representations of the above three fundamental elements: quality estimation, classification and localization. Three problems are discovered in existing practices, including (1) the inconsistent usage of the quality estimation and classification between training and inference, (2) the inflexible Dirac delta distribution for localization, and (3) the deficient and implicit guidance for accurate quality estimation. To address these problems, we design new representations for these elements. Specifically, we merge the quality estimation into the class prediction vector to form a joint representation, use a vector to represent arbitrary distribution of box locations, and extract discriminant feature descriptors from the distribution vector for more reliable quality estimation. The improved representations eliminate the inconsistency risk and accurately depict the flexible distribution in real data, but contain continuous labels, which is beyond the scope of Focal Loss. We then propose Generalized Focal Loss (GFocal) that generalizes Focal Loss from its discrete form to the continuous version for successful optimization. Extensive experiments demonstrate the effectiveness of our method, without sacrificing the efficiency both in training and inference. Based on GFocal, we construct a considerably fast and lightweight detector termed NanoDet under mobile settings, which is 1.8 AP higher, 2x faster and 6x smaller than scaled YoloV4-Tiny.
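The two representations the abstract describes — a joint classification-quality score trained against a continuous target, and box offsets learned as a discrete distribution over bins — can be sketched numerically. The following NumPy sketch is illustrative only, not the authors' reference implementation; function names, the bin layout, and the per-element (non-batched) interface are assumptions:

```python
import numpy as np

def quality_focal_loss(sigma, y, beta=2.0, eps=1e-12):
    """Quality Focal Loss: Focal Loss generalized to a continuous target.

    sigma: predicted joint classification-IoU score in (0, 1)
    y:     continuous quality label (e.g. IoU of the predicted box) in [0, 1];
           with y restricted to {0, 1} this reduces to the original Focal Loss
    beta:  focusing parameter that down-weights well-predicted examples
    """
    sigma = np.clip(sigma, eps, 1.0 - eps)
    # binary cross-entropy against the continuous label y
    ce = -((1.0 - y) * np.log(1.0 - sigma) + y * np.log(sigma))
    # modulating factor |y - sigma|^beta replaces (1 - p_t)^gamma
    return np.abs(y - sigma) ** beta * ce

def distribution_focal_loss(probs, y, bins, eps=1e-12):
    """Distribution Focal Loss: learn a discrete distribution over box offsets.

    probs: predicted (softmax) probabilities over the bin centers
    y:     continuous regression target with bins[0] <= y <= bins[-1]
    bins:  1-D array of increasing, equally spaced bin centers

    The loss pushes probability mass onto the two bins that bracket y, so the
    distribution's expectation can recover the continuous target.
    """
    probs = np.clip(probs, eps, 1.0)
    i = int(np.searchsorted(bins, y, side="right")) - 1
    i = min(max(i, 0), len(bins) - 2)  # clamp to a valid bracketing interval
    yi, yi1 = bins[i], bins[i + 1]
    # linear interpolation weights for the left and right neighbors of y
    w_left = (yi1 - y) / (yi1 - yi)
    w_right = (y - yi) / (yi1 - yi)
    return -(w_left * np.log(probs[i]) + w_right * np.log(probs[i + 1]))
```

For instance, with bins [0, 1, 2, 3] and target y = 1.5, `distribution_focal_loss` is minimized when the probability mass is split evenly between bins 1 and 2, which is exactly the distribution whose expectation equals 1.5.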


Similar Articles

1. Generalized Focal Loss: Towards Efficient Representation Learning for Dense Object Detection.
IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3139-3153. doi: 10.1109/TPAMI.2022.3180392. Epub 2023 Feb 3.
2. Interactive Regression and Classification for Dense Object Detector.
IEEE Trans Image Process. 2022;31:3684-3696. doi: 10.1109/TIP.2022.3174391. Epub 2022 May 26.
3. Training Robust Object Detectors From Noisy Category Labels and Imprecise Bounding Boxes.
IEEE Trans Image Process. 2021;30:5782-5792. doi: 10.1109/TIP.2021.3085208. Epub 2021 Jun 23.
4. CCDet: Confidence-Consistent Learning for Dense Object Detection.
IEEE Trans Image Process. 2024;33:2746-2758. doi: 10.1109/TIP.2024.3378457. Epub 2024 Apr 9.
5. Precision Detection of Dense Plums in Orchards Using the Improved YOLOv4 Model.
Front Plant Sci. 2022 Mar 11;13:839269. doi: 10.3389/fpls.2022.839269. eCollection 2022.
6. Focal Loss for Dense Object Detection.
IEEE Trans Pattern Anal Mach Intell. 2020 Feb;42(2):318-327. doi: 10.1109/TPAMI.2018.2858826. Epub 2018 Jul 23.
7. Hierarchical Regression and Classification for Accurate Object Detection.
IEEE Trans Neural Netw Learn Syst. 2023 May;34(5):2425-2439. doi: 10.1109/TNNLS.2021.3106641. Epub 2023 May 2.
8. IoU Regression with H+L-Sampling for Accurate Detection Confidence.
Sensors (Basel). 2021 Jun 28;21(13):4433. doi: 10.3390/s21134433.
9. PDNet: Towards Better One-stage Object Detection with Prediction Decoupling.
IEEE Trans Image Process. 2022 Jul 28;PP. doi: 10.1109/TIP.2022.3193223.
10. End-to-End Implicit Object Pose Estimation.
Sensors (Basel). 2024 Sep 3;24(17):5721. doi: 10.3390/s24175721.

Cited By

1. Recent advances in deep learning for lymphoma segmentation: Clinical applications and challenges.
Digit Health. 2025 Jul 28;11:20552076251362508. doi: 10.1177/20552076251362508. eCollection 2025 Jan-Dec.
2. Comparison of mask R-CNN and YOLOv8-seg for improved monitoring of the PCB surface during laser cleaning.
Sci Rep. 2025 May 17;15(1):17185. doi: 10.1038/s41598-025-02131-7.
3. MADNet: Marine Animal Detection Network using the YOLO platform.
PLoS One. 2025 May 8;20(5):e0322799. doi: 10.1371/journal.pone.0322799. eCollection 2025.
4. Research on Innovative Apple Grading Technology Driven by Intelligent Vision and Machine Learning.
Foods. 2025 Jan 15;14(2):258. doi: 10.3390/foods14020258.
5. Enhancing physician support in pancreatic cancer diagnosis: New M-F-RCNN artificial intelligence model using endoscopic ultrasound.
Endosc Int Open. 2024 Nov 7;12(11):E1277-E1284. doi: 10.1055/a-2422-9214. eCollection 2024 Nov.
6. Automated segmentation and source prediction of bone tumors using ConvNeXtv2 Fusion based Mask R-CNN to identify lung cancer metastasis.
J Bone Oncol. 2024 Sep 26;48:100637. doi: 10.1016/j.jbo.2024.100637. eCollection 2024 Oct.
7. scHiCyclePred: a deep learning framework for predicting cell cycle phases from single-cell Hi-C data using multi-scale interaction information.
Commun Biol. 2024 Jul 31;7(1):923. doi: 10.1038/s42003-024-06626-3.
8. Enhanced Water Surface Object Detection with Dynamic Task-Aligned Sample Assignment and Attention Mechanisms.
Sensors (Basel). 2024 May 14;24(10):3104. doi: 10.3390/s24103104.
9. Application of Machine Vision Techniques in Low-Cost Devices to Improve Efficiency in Precision Farming.
Sensors (Basel). 2024 Jan 31;24(3):937. doi: 10.3390/s24030937.
10. A simplified network topology for fruit detection, counting and mobile-phone deployment.
PLoS One. 2023 Oct 9;18(10):e0292600. doi: 10.1371/journal.pone.0292600. eCollection 2023.