Region-Based Convolutional Networks for Accurate Object Detection and Segmentation.

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2016 Jan;38(1):142-58. doi: 10.1109/TPAMI.2015.2437384.

Abstract

Object detection performance, as measured on the canonical PASCAL VOC Challenge datasets, plateaued in the final years of the competition. The best-performing methods were complex ensemble systems that typically combined multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 50 percent relative to the previous best result on VOC 2012, achieving a mAP of 62.4 percent. Our approach combines two ideas: (1) one can apply high-capacity convolutional networks (CNNs) to bottom-up region proposals in order to localize and segment objects, and (2) when labeled training data are scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, boosts performance significantly. Since we combine region proposals with CNNs, we call the resulting model an R-CNN or Region-based Convolutional Network. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
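The sketch below illustrates the two-stage pipeline the abstract describes: bottom-up region proposals are cropped from the image, warped to the CNN's fixed input size, and scored independently by a high-capacity network. It is a minimal illustration, not the paper's implementation: a torchvision AlexNet pretrained on ImageNet stands in for the supervised pre-training stage (the paper additionally fine-tunes on the detection classes and trains per-class SVMs plus bounding-box regression, all omitted here), and `propose_regions` is a hypothetical placeholder for a proposal method such as selective search.

```python
# Minimal R-CNN-style sketch: region proposals -> warp -> CNN scoring.
# Assumptions: ImageNet-pretrained AlexNet as the backbone; propose_regions
# is a hypothetical stand-in for a bottom-up proposal method.
import torch
from torchvision.models import alexnet, AlexNet_Weights

weights = AlexNet_Weights.IMAGENET1K_V1        # supervised pre-training on an auxiliary task
backbone = alexnet(weights=weights).eval()
preprocess = weights.transforms()              # warps each crop to the fixed CNN input size


def propose_regions(image):
    """Hypothetical bottom-up proposal generator (e.g. selective search).
    Returns a list of (x1, y1, x2, y2) boxes in pixel coordinates."""
    _, h, w = image.shape
    return [(0, 0, w // 2, h // 2), (w // 4, h // 4, w, h)]  # placeholder boxes


@torch.no_grad()
def score_proposals(image):
    """Crop each proposal, warp it, and score it independently with the CNN."""
    boxes = propose_regions(image)
    crops = [image[:, y1:y2, x1:x2] for (x1, y1, x2, y2) in boxes]
    batch = torch.stack([preprocess(c) for c in crops])
    probs = backbone(batch).softmax(dim=-1)
    scores, labels = probs.max(dim=-1)
    return list(zip(boxes, labels.tolist(), scores.tolist()))

# Usage (expects an RGB uint8 tensor of shape [3, H, W]):
#   from torchvision.io import read_image
#   detections = score_proposals(read_image("example.jpg"))
```

In the actual system, the per-region design is what lets a classification network be reused for detection: each proposal is treated as an independent classification problem, which is why pre-training on a large auxiliary dataset followed by fine-tuning transfers so effectively.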

