
Advanced Monocular Outdoor Pose Estimation in Autonomous Systems: Leveraging Optical Flow, Depth Estimation, and Semantic Segmentation with Dynamic Object Removal.

Authors

Ghasemieh Alireza, Kashef Rasha

Affiliation

Electrical, Computer, and Biomedical Engineering, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada.

Publication

Sensors (Basel). 2024 Dec 17;24(24):8040. doi: 10.3390/s24248040.

DOI: 10.3390/s24248040
PMID: 39771776
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11679697/
Abstract

Autonomous technologies have revolutionized transportation, military operations, and space exploration, necessitating precise localization in environments where traditional GPS-based systems are unreliable or unavailable. While widespread for outdoor localization, GPS systems face limitations in obstructed environments such as dense urban areas, forests, and indoor spaces. Moreover, GPS reliance introduces vulnerabilities to signal disruptions, which can lead to significant operational failures. Hence, developing alternative localization techniques that do not depend on external signals is essential, showing a critical need for robust, GPS-independent localization solutions adaptable to different applications, ranging from Earth-based autonomous vehicles to robotic missions on Mars. This paper addresses these challenges using visual odometry (VO) to estimate a camera's pose by analyzing captured image sequences in GPS-denied areas, tailored for autonomous vehicles (AVs), where safety and real-time decision-making are paramount. Extensive research has been dedicated to pose estimation using LiDAR or stereo cameras, which, despite their accuracy, are constrained by weight, cost, and complexity. In contrast, monocular vision is practical and cost-effective, making it a popular choice for drones, cars, and autonomous vehicles. However, robust and reliable monocular pose estimation models remain underexplored. This research aims to fill this gap by developing a novel adaptive framework for outdoor pose estimation and safe navigation using enhanced visual odometry systems with monocular cameras, especially for applications where deploying additional sensors is not feasible due to cost or physical constraints. This framework is designed to be adaptable across different vehicles and platforms, ensuring accurate and reliable pose estimation.
We integrate advanced control theory to provide safety guarantees for motion control, ensuring that the AV can react safely to the imminent hazards and unknown trajectories of nearby traffic agents. The focus is on creating an AI-driven model(s) that meets the performance standards of multi-sensor systems while leveraging the inherent advantages of monocular vision. This research uses state-of-the-art machine learning techniques to advance visual odometry's technical capabilities and ensure its adaptability across different platforms, cameras, and environments. By merging cutting-edge visual odometry techniques with robust control theory, our approach enhances both the safety and performance of AVs in complex traffic situations, directly addressing the challenge of safe and adaptive navigation. Experimental results on the KITTI odometry dataset demonstrate a significant improvement in pose estimation accuracy, offering a cost-effective and robust solution for real-world applications.
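The abstract describes masking out dynamic objects (via semantic segmentation) before estimating camera pose from monocular image sequences. The core filtering step can be sketched roughly as follows — a minimal illustration under stated assumptions, not the paper's implementation; the class IDs and the `filter_static_keypoints` helper are hypothetical:

```python
import numpy as np

# Hypothetical IDs for dynamic classes (e.g. person, rider, car);
# the paper's exact label set is not given in the abstract.
DYNAMIC_CLASSES = {11, 12, 13}

def filter_static_keypoints(keypoints, seg_map, dynamic_classes=DYNAMIC_CLASSES):
    """Keep only keypoints that fall on static scene content.

    keypoints : (N, 2) integer array of (row, col) pixel coordinates
    seg_map   : (H, W) array of per-pixel semantic class IDs
    Returns the subset of keypoints whose pixel is NOT a dynamic class,
    so that moving objects do not corrupt the pose estimate.
    """
    rows, cols = keypoints[:, 0], keypoints[:, 1]
    labels = seg_map[rows, cols]
    static = ~np.isin(labels, list(dynamic_classes))
    return keypoints[static]

# Toy example: a 4x4 label map with a "car" (class 13) in the top-left corner.
seg = np.zeros((4, 4), dtype=int)
seg[:2, :2] = 13
kps = np.array([[0, 0], [0, 3], [3, 1], [1, 1]])
print(filter_static_keypoints(kps, seg))  # keypoints on the car are dropped
```

In a full VO pipeline, the surviving correspondences would then feed a standard two-view pose solver (e.g. essential-matrix estimation with RANSAC); only the masking step is shown here.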


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/19114ae45d0d/sensors-24-08040-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/bcd8a02cd062/sensors-24-08040-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/ba056f3340a1/sensors-24-08040-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/4ea475812887/sensors-24-08040-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/ff5736e5e812/sensors-24-08040-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/986074e51479/sensors-24-08040-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/5c47c124e76e/sensors-24-08040-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/5392581eb058/sensors-24-08040-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/95b2722c4f38/sensors-24-08040-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/07731cb0864f/sensors-24-08040-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/0348fb05c9ed/sensors-24-08040-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/fa98fccbcf77/sensors-24-08040-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d82/11679697/b84912a434cb/sensors-24-08040-g013.jpg

Similar Articles

1. Advanced Monocular Outdoor Pose Estimation in Autonomous Systems: Leveraging Optical Flow, Depth Estimation, and Semantic Segmentation with Dynamic Object Removal.
Sensors (Basel). 2024 Dec 17;24(24):8040. doi: 10.3390/s24248040.
2. Semantic visual simultaneous localization and mapping (SLAM) using deep learning for dynamic scenes.
PeerJ Comput Sci. 2023 Oct 10;9:e1628. doi: 10.7717/peerj-cs.1628. eCollection 2023.
3. A Multi-Sensor Fusion MAV State Estimation from Long-Range Stereo, IMU, GPS and Barometric Sensors.
Sensors (Basel). 2016 Dec 22;17(1):11. doi: 10.3390/s17010011.
4. Visual Odometry Using Pixel Processor Arrays for Unmanned Aerial Systems in GPS Denied Environments.
Front Robot AI. 2020 Sep 29;7:126. doi: 10.3389/frobt.2020.00126. eCollection 2020.
5. Brain tumor segmentation and detection in MRI using convolutional neural networks and VGG16.
Cancer Biomark. 2025 Mar;42(3):18758592241311184. doi: 10.1177/18758592241311184. Epub 2025 Apr 4.
6. Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning.
Sensors (Basel). 2021 Jul 11;21(14):4735. doi: 10.3390/s21144735.
7. Adaptive Monocular Visual-Inertial SLAM for Real-Time Augmented Reality Applications in Mobile Devices.
Sensors (Basel). 2017 Nov 7;17(11):2567. doi: 10.3390/s17112567.
8. 6-DOF Pose Estimation of a Robotic Navigation Aid by Tracking Visual and Geometric Features.
IEEE Trans Autom Sci Eng. 2015 Oct;12(4):1169-1180. doi: 10.1109/TASE.2015.2469726. Epub 2015 Oct 5.
9. Sensor Fusion in Autonomous Vehicle with Traffic Surveillance Camera System: Detection, Localization, and AI Networking.
Sensors (Basel). 2023 Mar 22;23(6):3335. doi: 10.3390/s23063335.
10. Semantic Evidential Grid Mapping Using Monocular and Stereo Cameras.
Sensors (Basel). 2021 May 12;21(10):3380. doi: 10.3390/s21103380.

Cited By

1. Depth from 2D Images: Development and Metrological Evaluation of System Uncertainty Applied to Agricultural Scenarios.
Sensors (Basel). 2025 Jun 17;25(12):3790. doi: 10.3390/s25123790.

References

1. Joint estimation of pose, depth, and optical flow with a competition-cooperation transformer network.
Neural Netw. 2024 Mar;171:263-275. doi: 10.1016/j.neunet.2023.12.020. Epub 2023 Dec 14.
2. Unifying Flow, Stereo and Depth Estimation.
IEEE Trans Pattern Anal Mach Intell. 2023 Nov;45(11):13941-13958. doi: 10.1109/TPAMI.2023.3298645. Epub 2023 Oct 3.
3. Learning Dense and Continuous Optical Flow From an Event Camera.
IEEE Trans Image Process. 2022;31:7237-7251. doi: 10.1109/TIP.2022.3220938. Epub 2022 Nov 23.
4. An Unsupervised Monocular Visual Odometry Based on Multi-Scale Modeling.
Sensors (Basel). 2022 Jul 11;22(14):5193. doi: 10.3390/s22145193.
5. MSeg: A Composite Dataset for Multi-Domain Semantic Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2023 Jan;45(1):796-810. doi: 10.1109/TPAMI.2022.3151200. Epub 2022 Dec 5.
6. Deep High-Resolution Representation Learning for Visual Recognition.
IEEE Trans Pattern Anal Mach Intell. 2021 Oct;43(10):3349-3364. doi: 10.1109/TPAMI.2020.2983686. Epub 2021 Sep 2.