Object-Oriented and Visual-Based Localization in Urban Environments

Authors

Tsai Bo-Lung, Lin Kwei-Jay

Affiliations

Department of Electrical Engineering and Computer Science, University of California, Irvine, CA 92697, USA.

College of Intelligent Computing, Chang Gung University, Taoyuan 333, Taiwan.

Publication

Sensors (Basel). 2024 Mar 21;24(6):2014. doi: 10.3390/s24062014.

DOI: 10.3390/s24062014
PMID: 38544277
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10975017/
Abstract

In visual-based localization, prior research falls short in addressing challenges for the Internet of Things with limited computational resources. The dominant state-of-the-art models are based on separate feature extractors and descriptors without consideration of the constraints of small hardware, the issue of inconsistent image scale, or the presence of multi-objects. We introduce "OOPose", a real-time object-oriented pose estimation framework that leverages dense features from off-the-shelf object detection neural networks. It balances between pixel-matching accuracy and processing speed, enhancing overall performance. When input images share a comparable set of features, their matching accuracy is substantially heightened, while the reduction in image size facilitates faster processing but may compromise accuracy. OOPose resizes both the original library and cropped query object images to a width of 416 pixels. This adjustment results in a 2.4-fold improvement in pose accuracy and an 8.6-fold increase in processing speed. Moreover, OOPose eliminates the need for traditional sparse point extraction and description processes by capitalizing on dense network backbone features and selecting the detected query objects and sources of object library images, ensuring not only 1.3 times more accurate results but also three times greater stability compared to real-time sparse ORB matching algorithms. Beyond enhancements, we demonstrated the feasibility of OOPose in an autonomous mobile robot, enabling self-localization with a single camera at 10 FPS on a single CPU. It proves the cost-effectiveness and real-world applicability of OOPose for small embedded devices, setting the stage for potential markets and providing end-users with distinct advantages.
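The abstract's two key steps can be sketched in code: resizing both library and cropped query images to a fixed 416-pixel width (preserving aspect ratio), and nearest-neighbour matching between dense descriptors drawn from a detection backbone. This is an illustrative pure-Python sketch, not OOPose's actual implementation; the function names and the toy 2-D descriptors are assumptions for demonstration only.

```python
import math

TARGET_W = 416  # fixed width OOPose applies to both library and cropped query images

def resize_dims(w, h, target_w=TARGET_W):
    """Scale (w, h) so the width becomes target_w, preserving aspect ratio."""
    scale = target_w / w
    return target_w, round(h * scale)

def match_dense_features(query, library):
    """Nearest-neighbour matching between two sets of dense descriptors.

    Each descriptor is a plain list of floats, standing in for vectors
    sampled from a detection network's backbone feature grid (hypothetical
    stand-in, not the paper's code). Returns, for each query descriptor,
    the index of the closest library descriptor.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return [min(range(len(library)), key=lambda i: dist(q, library[i]))
            for q in query]

# Toy usage: a 1080p frame's resized dimensions, then three slightly
# perturbed query descriptors matched back to a small library.
library = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [[0.9, 0.1], [0.1, 0.9], [1.1, 0.9]]
print(resize_dims(1920, 1080))               # -> (416, 234)
print(match_dense_features(query, library))  # -> [1, 2, 3]
```

In practice the descriptors would be high-dimensional backbone activations and the matching would be batched, but the structure, downscale first, then match dense features rather than sparse keypoints, is what the abstract credits for the speed and accuracy gains.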


Figures (sensors-24-02014, g001–g011):

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/10975017/005f559a6c9e/sensors-24-02014-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/10975017/be05a526de37/sensors-24-02014-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/10975017/48e02f702b6d/sensors-24-02014-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/10975017/83406069b390/sensors-24-02014-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/10975017/fc5eaba8e8fd/sensors-24-02014-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/10975017/50e0f1910c75/sensors-24-02014-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/10975017/dd139f28d164/sensors-24-02014-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/10975017/0199f6a4fbf2/sensors-24-02014-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/10975017/e4c568b8206e/sensors-24-02014-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/10975017/caca452cc33b/sensors-24-02014-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/10975017/98771617364d/sensors-24-02014-g011.jpg

Similar Articles

1. Object-Oriented and Visual-Based Localization in Urban Environments.
   Sensors (Basel). 2024 Mar 21;24(6):2014. doi: 10.3390/s24062014.
2. A Manufacturing-Oriented Intelligent Vision System Based on Deep Neural Network for Object Recognition and 6D Pose Estimation.
   Front Neurorobot. 2021 Jan 7;14:616775. doi: 10.3389/fnbot.2020.616775. eCollection 2020.
3. 3D Point Cloud Object Detection Method Based on Multi-Scale Dynamic Sparse Voxelization.
   Sensors (Basel). 2024 Mar 11;24(6):1804. doi: 10.3390/s24061804.
4. Graph-DETR4D: Spatio-Temporal Graph Modeling for Multi-View 3D Object Detection.
   IEEE Trans Image Process. 2024;33:4488-4500. doi: 10.1109/TIP.2024.3430473. Epub 2024 Aug 21.
5. A Novel Framework for Image Matching and Stitching for Moving Car Inspection under Illumination Challenges.
   Sensors (Basel). 2024 Feb 7;24(4):1083. doi: 10.3390/s24041083.
6. Object recognition in medical images via anatomy-guided deep learning.
   Med Image Anal. 2022 Oct;81:102527. doi: 10.1016/j.media.2022.102527. Epub 2022 Jun 25.
7. Image partitioning and illumination in image-based pose detection for teleoperated flexible endoscopes.
   Artif Intell Med. 2013 Nov;59(3):185-96. doi: 10.1016/j.artmed.2013.09.002. Epub 2013 Oct 10.
8. Analyzing the Impact of Objects in an Image on Location Estimation Accuracy in Visual Localization.
   Sensors (Basel). 2024 Jan 26;24(3):816. doi: 10.3390/s24030816.
9. Learning Semantic-Aware Local Features for Long Term Visual Localization.
   IEEE Trans Image Process. 2022;31:4842-4855. doi: 10.1109/TIP.2022.3187565. Epub 2022 Jul 20.
10. Regression-Based Camera Pose Estimation through Multi-Level Local Features and Global Features.
   Sensors (Basel). 2023 Apr 18;23(8):4063. doi: 10.3390/s23084063.

References Cited in This Article

1. CVANet: Cascaded visual attention network for single image super-resolution.
   Neural Netw. 2024 Feb;170:622-634. doi: 10.1016/j.neunet.2023.11.049. Epub 2023 Nov 24.
2. MNGNAS: Distilling Adaptive Combination of Multiple Searched Networks for One-Shot Neural Architecture Search.
   IEEE Trans Pattern Anal Mach Intell. 2023 Nov;45(11):13489-13508. doi: 10.1109/TPAMI.2023.3293885. Epub 2023 Oct 3.
3. Improving Video Temporal Consistency via Broad Learning System.
   IEEE Trans Cybern. 2022 Jul;52(7):6662-6675. doi: 10.1109/TCYB.2021.3079311. Epub 2022 Jul 4.
4. Are Large-Scale 3D Models Really Necessary for Accurate Visual Localization?
   IEEE Trans Pattern Anal Mach Intell. 2021 Mar;43(3):814-829. doi: 10.1109/TPAMI.2019.2941876. Epub 2021 Feb 4.
5. A Study of Vicon System Positioning Performance.
   Sensors (Basel). 2017 Jul 7;17(7):1591. doi: 10.3390/s17071591.
6. NetVLAD: CNN Architecture for Weakly Supervised Place Recognition.
   IEEE Trans Pattern Anal Mach Intell. 2018 Jun;40(6):1437-1451. doi: 10.1109/TPAMI.2017.2711011. Epub 2017 Jun 1.
7. Efficient & Effective Prioritized Matching for Large-Scale Image-Based Localization.
   IEEE Trans Pattern Anal Mach Intell. 2017 Sep;39(9):1744-1756. doi: 10.1109/TPAMI.2016.2611662. Epub 2016 Sep 20.
8. City-Scale Localization for Cameras with Known Vertical Direction.
   IEEE Trans Pattern Anal Mach Intell. 2017 Jul;39(7):1455-1461. doi: 10.1109/TPAMI.2016.2598331. Epub 2016 Aug 5.
9. SIFT flow: dense correspondence across scenes and its applications.
   IEEE Trans Pattern Anal Mach Intell. 2011 May;33(5):978-94. doi: 10.1109/TPAMI.2010.147.
10. Faster and better: a machine learning approach to corner detection.
   IEEE Trans Pattern Anal Mach Intell. 2010 Jan;32(1):105-19. doi: 10.1109/TPAMI.2008.275.