
Visual place recognition from end-to-end semantic scene text features.

Authors

Raisi Zobeir, Zelek John

Affiliations

Electrical Engineering Department, Chabahar Maritime University, Chabahar, Iran.

Vision and Image Processing Laboratory, Systems Design Engineering Department, University of Waterloo, Waterloo, ON, Canada.

Publication information

Front Robot AI. 2024 Sep 16;11:1424883. doi: 10.3389/frobt.2024.1424883. eCollection 2024.

DOI:10.3389/frobt.2024.1424883
PMID:39350962
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11440043/
Abstract

We live in a visual world where text cues are abundant in urban environments. The premise of our work is that robots can capitalize on these text features for visual place recognition. We introduce a new technique that uses end-to-end scene text detection and recognition to improve robot localization and mapping through Visual Place Recognition (VPR). This technique addresses several challenges, such as arbitrary-shaped text, illumination variation, and occlusion. The proposed model, designed specifically for VPR tasks, captures text strings and their associated bounding boxes. The primary contribution of this work is the use of an end-to-end scene text spotting framework that can effectively capture irregular and occluded text in diverse environments. We conduct experimental evaluations on the Self-Collected TextPlace (SCTP) benchmark dataset, where our approach outperforms state-of-the-art methods in terms of precision and recall, which validates the effectiveness and potential of the proposed approach for VPR.
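
The abstract describes matching places by the text strings a scene text spotter reads from each image. As a rough illustration of that idea only, and not the authors' pipeline, the sketch below assumes the spotter has already produced a list of recognized strings per image, ignores bounding boxes entirely, and scores a query against each mapped place by fuzzy string matching. The function names, the `place_map` structure, the sample strings, and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def text_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] between two recognized strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def place_score(query_texts, map_texts, threshold=0.8):
    """Fraction of query strings with a close match among the map strings."""
    if not query_texts:
        return 0.0
    matched = sum(
        1
        for q in query_texts
        if any(text_similarity(q, m) >= threshold for m in map_texts)
    )
    return matched / len(query_texts)


def recognize_place(query_texts, place_map, threshold=0.8):
    """Return (best_place_id, score), or (None, 0.0) when nothing matches."""
    best_id, best_score = None, 0.0
    for place_id, map_texts in place_map.items():
        score = place_score(query_texts, map_texts, threshold)
        if score > best_score:
            best_id, best_score = place_id, score
    return best_id, best_score


# Hypothetical map: place id -> strings spotted when the place was mapped.
place_map = {
    "cafe_corner": ["STARBUCKS", "COFFEE", "OPEN"],
    "bookshop": ["BOOKS", "SALE", "OPEN"],
}

# Hypothetical query with one recognition error ("Starbuks").
query = ["Starbuks", "coffee"]
best_id, score = recognize_place(query, place_map)
print(best_id, score)
```

Fuzzy matching is used here because recognized strings from occluded or irregular text often contain small character errors; a real system would likely also weigh the bounding-box geometry that the abstract mentions.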

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/264c/11440043/262bce18940f/frobt-11-1424883-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/264c/11440043/d5c5e7db8089/frobt-11-1424883-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/264c/11440043/3a3b7896cb0a/frobt-11-1424883-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/264c/11440043/4b2f243e17fc/frobt-11-1424883-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/264c/11440043/ab009646aa71/frobt-11-1424883-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/264c/11440043/c68d263784c5/frobt-11-1424883-g006.jpg
