
R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation.

Affiliations

School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China.

Lancaster Environment Centre, Lancaster University, Lancaster LA1 4YQ, UK.

Publication information

Sensors (Basel). 2021 Jan 28;21(3):888. doi: 10.3390/s21030888.

DOI:10.3390/s21030888
PMID:33525619
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7865800/
Abstract

Accurate and efficient text detection in natural scenes is a fundamental yet challenging task in computer vision, especially when dealing with arbitrarily oriented texts. Most contemporary text detection methods are designed to identify horizontal or approximately horizontal text, which cannot satisfy practical detection requirements for various real-world images such as image streams or videos. To address this lacuna, we propose a novel method called Rotational You Only Look Once (R-YOLO), a robust real-time convolutional neural network (CNN) model to detect arbitrarily oriented texts in natural image scenes. First, a rotated anchor box with angle information is used as the text bounding box over various orientations. Second, features of various scales are extracted from the input image to determine the probability, confidence, and inclined bounding boxes of the text. Finally, Rotational Distance Intersection over Union Non-Maximum Suppression is used to eliminate redundancy and acquire detection results with the highest accuracy. Benchmark experiments are conducted on four popular datasets, i.e., ICDAR2015, ICDAR2013, MSRA-TD500, and ICDAR2017-MLT. The results indicate that the proposed R-YOLO method significantly outperforms state-of-the-art methods in terms of detection efficiency while maintaining high accuracy; for example, the proposed R-YOLO method achieves an F-measure of 82.3% at 62.5 fps with 720p resolution on the ICDAR2015 dataset.
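
The last step of the abstract, non-maximum suppression driven by a rotational Distance-IoU score, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The box format `(cx, cy, w, h, angle)` with the angle in radians, the function names, the 0.5 threshold, and the use of the smallest axis-aligned box enclosing both rotated boxes for the distance penalty are all assumptions made here for illustration. The overlap of two rotated rectangles is computed with Sutherland-Hodgman polygon clipping.

```python
import math

def corners(box):
    """Corner points of a rotated box (cx, cy, w, h, angle in radians)."""
    cx, cy, w, h, a = box
    c, s = math.cos(a), math.sin(a)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]

def polygon_area(pts):
    """Signed polygon area via the shoelace formula."""
    pairs = zip(pts, pts[1:] + pts[:1])
    return 0.5 * sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)

def clip(subject, clipper):
    """Sutherland-Hodgman clipping of one convex polygon by another.

    Both polygons must list their vertices in the order produced by corners().
    """
    out = subject
    for (ax, ay), (bx, by) in zip(clipper, clipper[1:] + clipper[:1]):
        if not out:
            break
        inp, out = out, []

        def side(p):
            # >= 0 when p lies on the inner side of the clip edge a -> b
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        for p, q in zip(inp, inp[1:] + inp[:1]):
            sp, sq = side(p), side(q)
            if sp >= 0:
                out.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)  # where segment p -> q crosses the edge
                out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out

def rotated_iou(a, b):
    """Intersection over union of two rotated boxes."""
    poly = clip(corners(a), corners(b))
    inter = abs(polygon_area(poly)) if len(poly) >= 3 else 0.0
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def rotated_diou(a, b):
    """IoU minus squared centre distance over squared enclosing diagonal.

    Assumption: the enclosing box is the smallest axis-aligned box that
    contains the corners of both rotated boxes.
    """
    pts = corners(a) + corners(b)
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    c2 = (max(xs) - min(xs)) ** 2 + (max(ys) - min(ys)) ** 2
    d2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return rotated_iou(a, b) - d2 / c2

def rdiou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS: keep a box unless a higher-scoring kept box exceeds threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(rotated_diou(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep

boxes = [(50, 50, 100, 20, 0.30), (52, 51, 100, 20, 0.32), (200, 200, 80, 20, -1.0)]
scores = [0.9, 0.8, 0.7]
print(rdiou_nms(boxes, scores))  # → [0, 2]
```

In this toy input the second box nearly coincides with the first and is suppressed, while the distant third box is kept. The distance term lowers the score of boxes whose centres are far apart, so such boxes are less likely to be suppressed than under plain IoU.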


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/b13b92142b90/sensors-21-00888-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/fb61cb5d9b0f/sensors-21-00888-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/9d8b24cd35fc/sensors-21-00888-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/79ddd20c939c/sensors-21-00888-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/540cebffe847/sensors-21-00888-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/708489f500bb/sensors-21-00888-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/e1a3420dfdc8/sensors-21-00888-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/2dfbc2cdbdc3/sensors-21-00888-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/f0ce9c7dad50/sensors-21-00888-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/c4640642a7b8/sensors-21-00888-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/af1f4d13858a/sensors-21-00888-g011a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/7b89e50d59e7/sensors-21-00888-g012a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50f1/7865800/ebb6e8853ff6/sensors-21-00888-g013.jpg

Similar articles

1
R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation.
Sensors (Basel). 2021 Jan 28;21(3):888. doi: 10.3390/s21030888.
2
A real-time arbitrary-shape text detector.
PLoS One. 2024 Apr 16;19(4):e0302234. doi: 10.1371/journal.pone.0302234. eCollection 2024.
3
TextField: Learning a Deep Direction Field for Irregular Scene Text Detection.
IEEE Trans Image Process. 2019 Nov;28(11):5566-5579. doi: 10.1109/TIP.2019.2900589. Epub 2019 Feb 21.
4
Arbitrary Shape Text Detection via Segmentation With Probability Maps.
IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):2736-2750. doi: 10.1109/TPAMI.2022.3176122. Epub 2023 Feb 3.
5
ACE: Anchor-Free Corner Evolution for Real-Time Arbitrarily-Oriented Object Detection.
IEEE Trans Image Process. 2022;31:4076-4089. doi: 10.1109/TIP.2022.3167919. Epub 2022 Jun 17.
6
Efficient algorithm for directed text detection based on rotation decoupled bounding box.
PeerJ Comput Sci. 2023 May 9;9:e1352. doi: 10.7717/peerj-cs.1352. eCollection 2023.
7
Irregular Scene Text Detection Based on a Graph Convolutional Network.
Sensors (Basel). 2023 Jan 17;23(3):1070. doi: 10.3390/s23031070.
8
Mixed-Supervised Scene Text Detection With Expectation-Maximization Algorithm.
IEEE Trans Image Process. 2022;31:5513-5528. doi: 10.1109/TIP.2022.3197987. Epub 2022 Aug 22.
9
Towards toxic and narcotic medication detection with rotated object detectors.
Quant Imaging Med Surg. 2023 Apr 1;13(4):2156-2166. doi: 10.21037/qims-21-1146. Epub 2023 Feb 24.
10
DB-YOLO: A Duplicate Bilateral YOLO Network for Multi-Scale Ship Detection in SAR Images.
Sensors (Basel). 2021 Dec 6;21(23):8146. doi: 10.3390/s21238146.

Cited by

1
A real-time arbitrary-shape text detector.
PLoS One. 2024 Apr 16;19(4):e0302234. doi: 10.1371/journal.pone.0302234. eCollection 2024.
2
Irregular Scene Text Detection Based on a Graph Convolutional Network.
Sensors (Basel). 2023 Jan 17;23(3):1070. doi: 10.3390/s23031070.
3
SEMPANet: A Modified Path Aggregation Network with Squeeze-Excitation for Scene Text Detection.
