Zheng Linxin, Xiao Guobao, Shi Ziwei, Wang Shiping, Ma Jiayi
IEEE Trans Image Process. 2022;31:4598-4608. doi: 10.1109/TIP.2022.3186535. Epub 2022 Jul 12.
In this paper, we propose a novel multi-scale attention-based network (called MSA-Net) for feature matching problems. Current deep-network-based feature matching methods suffer from limited effectiveness and robustness when applied to different scenarios, owing to the random distribution of outliers and insufficient information learning. To address this issue, we propose a multi-scale attention block that enhances robustness to outliers and improves the representational ability of the feature map. In addition, we design a novel context channel refine block and a context spatial refine block to mine contextual information with fewer parameters along the channel and spatial dimensions, respectively. The proposed MSA-Net is able to effectively infer the probability of each correspondence being an inlier with fewer parameters. Extensive experiments on outlier removal and relative pose estimation show that our network outperforms current state-of-the-art methods with fewer parameters on both outdoor and indoor datasets. Notably, when trained on the YFCC100M dataset, our proposed network achieves an 11.7% improvement over the state-of-the-art method at an error threshold of 5° without RANSAC on the relative pose estimation task.
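The abstract describes two things that a small example may make more concrete: a network that outputs a probability of each correspondence being an inlier, and a pose evaluation at an error threshold of 5°. The sketch below is not the MSA-Net implementation. It assumes the network emits one raw score (logit) per correspondence, that a sigmoid and a 0.5 cutoff are used to keep inliers, and that pose quality is scored as the fraction of image pairs whose rotation error falls within the threshold. The abstract does not specify the exact metric, and all function names here are hypothetical.

```python
import math


def inlier_probabilities(logits):
    """Map per-correspondence logits to probabilities with a stable sigmoid."""
    probs = []
    for z in logits:
        if z >= 0:
            probs.append(1.0 / (1.0 + math.exp(-z)))
        else:
            # Avoid overflow in exp(-z) for large negative z.
            e = math.exp(z)
            probs.append(e / (1.0 + e))
    return probs


def filter_correspondences(correspondences, logits, threshold=0.5):
    """Keep correspondences whose inlier probability exceeds the threshold.

    The 0.5 cutoff is an assumption, not a value taken from the paper.
    """
    probs = inlier_probabilities(logits)
    return [c for c, p in zip(correspondences, probs) if p > threshold]


def rotation_error_deg(r_est, r_gt):
    """Angle in degrees of the relative rotation r_gt^T r_est.

    Both arguments are 3x3 rotation matrices given as nested lists.
    """
    # trace(r_gt^T r_est) equals the sum of elementwise products.
    trace = sum(r_gt[i][j] * r_est[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against rounding before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def fraction_within(errors_deg, threshold_deg=5.0):
    """Fraction of image pairs whose pose error is within the threshold."""
    return sum(e <= threshold_deg for e in errors_deg) / len(errors_deg)
```

For example, each correspondence could be a tuple `(x1, y1, x2, y2)` of matched keypoint coordinates. `filter_correspondences` would then return the subset judged to be inliers, and `fraction_within` would summarize a list of per-pair rotation errors at the 5° threshold.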