MCFA: Multi-Scale Cascade and Feature Adaptive Alignment Network for Cross-View Geo-Localization.

Authors

Hou Kaiji, Tong Qiang, Yan Na, Liu Xiulei, Hou Shoulu

Affiliation

College of Computer Science, Beijing Information Science and Technology University, Beijing 102206, China.

Publication information

Sensors (Basel). 2025 Jul 21;25(14):4519. doi: 10.3390/s25144519.

Abstract

Cross-view geo-localization (CVGL) presents significant challenges due to the drastic variations in perspective and scene layout between unmanned aerial vehicle (UAV) and satellite images. Existing methods have made certain advancements in extracting local features from images. However, they exhibit limitations in modeling the interactions among local features and fall short in aligning cross-view representations accurately. To address these issues, we propose a Multi-Scale Cascade and Feature Adaptive Alignment (MCFA) network, which consists of a Multi-Scale Cascade Module (MSCM) and a Feature Adaptive Alignment Module (FAAM). The MSCM captures the features of the target's adjacent regions and enhances the model's robustness by learning key region information through association and fusion. The FAAM, with its dynamically weighted feature alignment module, adaptively adjusts feature differences across different viewpoints, achieving feature alignment between drone and satellite images. Our method achieves state-of-the-art (SOTA) performance on two public datasets, University-1652 and SUES-200. In generalization experiments, our model outperforms existing SOTA methods, with an average improvement of 1.52% in R@1 and 2.09% in AP, demonstrating its effectiveness and strong generalization in cross-view geo-localization tasks.
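The R@1 and AP figures quoted in the abstract are retrieval metrics: each drone-view query is matched against a gallery of satellite-view embeddings, and the ranked result is scored. The following is a minimal sketch of how such scores are commonly computed, not the authors' evaluation code. The function names (`rank_gallery`, `recall_at_k`, `average_precision`), the toy 2-D embeddings, and the labels are all illustrative assumptions, and the AP here is the plain non-interpolated definition, which benchmark scripts may compute slightly differently.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return gallery indices sorted by descending similarity to the query."""
    sims = [cosine(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: sims[i], reverse=True)


def recall_at_k(ranked_labels, query_label, k=1):
    """1.0 if a correct match appears in the top-k results, else 0.0."""
    return 1.0 if query_label in ranked_labels[:k] else 0.0


def average_precision(ranked_labels, query_label):
    """Mean of the precision values at each rank where a correct match occurs."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Toy data (hypothetical): one drone-view query, four satellite-view gallery items.
gallery = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.7, 0.7]]
gallery_labels = ["A", "A", "B", "A"]
query, query_label = [1.0, 0.1], "A"

order = rank_gallery(query, gallery)
ranked_labels = [gallery_labels[i] for i in order]

print("ranking:", order)
print("R@1:", recall_at_k(ranked_labels, query_label, k=1))
print("AP:", average_precision(ranked_labels, query_label))
```

Dataset-level R@1 and AP are then the averages of these per-query values over all queries.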

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fe35/12299452/1ddab3a66c8e/sensors-25-04519-g001.jpg
