Deformable Part Region Learning and Feature Aggregation Tree Representation for Object Detection.

Author information

Bae Seung-Hwan

Publication information

IEEE Trans Pattern Anal Mach Intell. 2023 Sep;45(9):10817-10834. doi: 10.1109/TPAMI.2023.3268864. Epub 2023 Aug 7.

DOI: 10.1109/TPAMI.2023.3268864
PMID: 37079404

Abstract

Region-based object detection infers object regions for one or more categories in an image. Owing to recent advances in deep learning and region proposal methods, object detectors based on convolutional neural networks (CNNs) have flourished and delivered promising detection results. However, the accuracy of convolutional object detectors is often degraded by the low feature discriminability caused by geometric variation or transformation of an object. In this article, we propose deformable part region (DPR) learning, which allows decomposed part regions to deform according to the geometric transformation of an object. Because ground truth for the part models is unavailable in many cases, we design part model losses for detection and segmentation, and learn the geometric parameters by minimizing an integral loss that includes those part losses. As a result, we can train our DPR network without extra supervision and make multi-part models deformable according to object geometric variation. Moreover, we propose a novel feature aggregation tree (FAT) to learn more discriminative region of interest (RoI) features via bottom-up tree construction. The FAT learns stronger semantic features by aggregating part RoI features along the bottom-up pathways of the tree. We also present a spatial and channel attention mechanism for aggregation between different node features. Based on the proposed DPR and FAT networks, we design a new cascade architecture that refines detection tasks iteratively. Without bells and whistles, we achieve impressive detection and segmentation results on the MSCOCO and PASCAL VOC datasets. Our Cascade D-PRD achieves 57.9 box AP with the Swin-L backbone. We also provide an extensive ablation study to demonstrate the effectiveness and usefulness of the proposed methods for large-scale object detection.
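
The bottom-up aggregation of part RoI features with spatial and channel attention can be illustrated with a small NumPy sketch. This is not the paper's implementation: the abstract gives no formulas, so the function names (`channel_attention`, `spatial_attention`, `aggregate_pair`, `aggregate_tree`), the parameter-free sigmoid gates, the pairwise merging of neighbouring nodes, and the feature shape `(C, H, W)` are all assumptions made for illustration. In a trained network the gates would be learned layers rather than fixed pooling.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    """Hypothetical channel gate: one weight per channel from global average pooling."""
    pooled = feat.mean(axis=(1, 2))            # (C,)
    return sigmoid(pooled)[:, None, None]      # (C, 1, 1), broadcast over H and W


def spatial_attention(feat):
    """Hypothetical spatial gate: one weight per location from the channel mean."""
    pooled = feat.mean(axis=0)                 # (H, W)
    return sigmoid(pooled)[None, :, :]         # (1, H, W), broadcast over C


def aggregate_pair(left, right):
    """Merge two child node features into a parent with an attention-weighted average."""
    w_left = channel_attention(left) * spatial_attention(left)
    w_right = channel_attention(right) * spatial_attention(right)
    # Both weights are strictly positive, so this is a convex combination.
    return (w_left * left + w_right * right) / (w_left + w_right)


def aggregate_tree(leaves):
    """Bottom-up pass: merge neighbouring nodes level by level until one root remains."""
    level = list(leaves)
    while len(level) > 1:
        merged = [aggregate_pair(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:                # an unpaired node moves up unchanged
            merged.append(level[-1])
        level = merged
    return level[0]


# Four part RoI features (assumed shape: 8 channels, 7x7 grid) as tree leaves.
rng = np.random.default_rng(0)
parts = [rng.standard_normal((8, 7, 7)) for _ in range(4)]
root = aggregate_tree(parts)
```

Because each merge is a convex combination, the root keeps the shape of a single part feature and every value lies between the smallest and largest leaf value at that position. This bound is a property of the sketch, not a claim about the published model.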

Similar articles

1. Deformable Part Region Learning and Feature Aggregation Tree Representation for Object Detection.
   IEEE Trans Pattern Anal Mach Intell. 2023 Sep;45(9):10817-10834. doi: 10.1109/TPAMI.2023.3268864. Epub 2023 Aug 7.
2. HROM: Learning High-Resolution Representation and Object-Aware Masks for Visual Object Tracking.
   Sensors (Basel). 2020 Aug 26;20(17):4807. doi: 10.3390/s20174807.
3. Proposal-Free Fully Convolutional Network: Object Detection Based on a Box Map.
   Sensors (Basel). 2024 May 30;24(11):3529. doi: 10.3390/s24113529.
4. Rethinking Attentive Object Detection via Neural Attention Learning.
   IEEE Trans Image Process. 2024;33:1726-1739. doi: 10.1109/TIP.2023.3251693. Epub 2024 Mar 7.
5. Object Detection Based on Swin Deformable Transformer-BiPAFPN-YOLOX.
   Comput Intell Neurosci. 2023 Mar 9;2023:4228610. doi: 10.1155/2023/4228610. eCollection 2023.
6. DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection.
   IEEE Trans Pattern Anal Mach Intell. 2017 Jul;39(7):1320-1334. doi: 10.1109/TPAMI.2016.2587642. Epub 2016 Jul 7.
7. Single-Shot Object Detection via Feature Enhancement and Channel Attention.
   Sensors (Basel). 2022 Sep 10;22(18):6857. doi: 10.3390/s22186857.
8. A new multi-scale backbone network for object detection based on asymmetric convolutions.
   Sci Prog. 2021 Apr-Jun;104(2):368504211011343. doi: 10.1177/00368504211011343.
9. Real-Time Object Detection With Reduced Region Proposal Network via Multi-Feature Concatenation.
   IEEE Trans Neural Netw Learn Syst. 2020 Jun;31(6):2164-2173. doi: 10.1109/TNNLS.2019.2929059. Epub 2019 Aug 21.
10. Deep Regionlets: Blended Representation and Deep Learning for Generic Object Detection.
    IEEE Trans Pattern Anal Mach Intell. 2021 Jun;43(6):1914-1927. doi: 10.1109/TPAMI.2019.2957780. Epub 2021 May 11.

Cited by

1. Efficient underwater object detection based on feature enhancement and attention detection head.
   Sci Rep. 2025 Feb 18;15(1):5973. doi: 10.1038/s41598-025-89421-2.
2. A Comprehensive Survey of Machine Learning Techniques and Models for Object Detection.
   Sensors (Basel). 2025 Jan 2;25(1):214. doi: 10.3390/s25010214.