An application of stereo matching algorithm based on transfer learning on robots in multiple scenes.

Author information

Bi Yuanwei, Li Chuanbiao, Tong Xiangrong, Wang Guohui, Sun Haiwei

Affiliation

School of Computer Control and Engineering, Yantai University, Yantai, 264005, China.

Publication information

Sci Rep. 2023 Aug 6;13(1):12739. doi: 10.1038/s41598-023-39964-z.

Abstract

Robot vision based on binocular vision holds great potential in fields such as 3D scene reconstruction, target detection, and autonomous driving. However, the binocular vision methods currently used in robotics engineering suffer from high cost, complex algorithms, and disparity maps of low reliability across different scenes. To overcome these challenges, this paper proposes a cross-domain stereo matching algorithm for binocular vision based on transfer learning, named Cross-Domain Adaptation and Transfer Learning Network (Ct-Net), which has shown valuable results in multiple robot scenes. First, a General Feature Extractor is introduced to extract rich general feature information for domain-adaptive stereo matching. Then, a feature adapter adapts the general features to the stereo matching network. Furthermore, a Domain Adaptive Cost Optimization Module is designed to optimize the matching cost. A disparity score prediction module is also embedded to adaptively adjust the disparity search range and optimize the cost distribution. The overall framework is trained using a phased strategy, and ablation experiments verify the effectiveness of this training strategy. Compared with the prototype PSMNet on the KITTI 2015 benchmark, the 3PE-fg of Ct-Net decreased by 19.3% in all regions and by 21.1% in non-occluded regions. On the Middlebury dataset, the proposed algorithm reduces the error rate of every sample by at least 28.4%, with the smallest reduction on the Staircase sample. Quantitative and qualitative results on Middlebury, Apollo, and other datasets demonstrate that Ct-Net significantly improves the cross-domain performance of stereo matching. Stereo matching experiments in real-world scenes show that it can effectively handle visual tasks in multiple scenes.
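
To make the reported metric concrete, the following is a minimal sketch of how a three-pixel error (3PE) could be computed for a predicted disparity map. It assumes the common KITTI 2015 convention, under which a pixel counts as erroneous when its disparity error exceeds both 3 px and 5% of the ground-truth disparity. The abstract does not spell out the definition used for 3PE-fg, so the thresholds, the optional foreground mask, and the function names `three_pixel_error` and `relative_reduction` are illustrative assumptions, not the authors' evaluation code. The sketch also assumes the quoted 19.3% and 21.1% are relative reductions against PSMNet.

```python
from typing import Optional, Sequence

Grid = Sequence[Sequence[float]]


def three_pixel_error(
    pred: Grid,
    gt: Grid,
    mask: Optional[Sequence[Sequence[bool]]] = None,
    abs_thresh: float = 3.0,   # assumed absolute threshold in pixels
    rel_thresh: float = 0.05,  # assumed relative threshold (5% of ground truth)
) -> float:
    """Return the percentage of valid pixels counted as erroneous.

    A pixel is valid when its ground-truth disparity is positive and,
    if a mask is given (e.g. a hypothetical foreground mask), the mask
    is true at that position.
    """
    valid = 0
    bad = 0
    for r, (pred_row, gt_row) in enumerate(zip(pred, gt)):
        for c, (p, g) in enumerate(zip(pred_row, gt_row)):
            if g <= 0:  # no ground truth available for this pixel
                continue
            if mask is not None and not mask[r][c]:
                continue
            valid += 1
            err = abs(p - g)
            if err > abs_thresh and err > rel_thresh * g:
                bad += 1
    if valid == 0:
        raise ValueError("no valid ground-truth pixels")
    return 100.0 * bad / valid


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction (in percent) of an error rate against a baseline."""
    return 100.0 * (baseline - improved) / baseline
```

Passing a foreground mask restricts the count to foreground pixels, which is one plausible reading of the "fg" suffix. `relative_reduction` turns two error rates (baseline and improved) into a percentage of the kind quoted in the abstract.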

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/351a/10404586/3481dfad178d/41598_2023_39964_Fig1_HTML.jpg