Hu Xiuhua, Zhao Jing, Hui Yan, Li Shuang, You Shijie
School of Computer Science and Engineering, Xi'an Technological University, Xi'an 710021, China.
State and Provincial Joint Engineering Laboratory of Advanced Network, Monitoring and Control, Xi'an 710021, China.
Sensors (Basel). 2023 Oct 24;23(21):8666. doi: 10.3390/s23218666.
Owing to the high maneuverability and hardware limitations of Unmanned Aerial Vehicle (UAV) platforms, target tracking in UAV views often faces challenges such as low resolution, fast motion, and background interference, which make it difficult to balance performance and efficiency. Based on the Siamese network framework, this paper proposes a novel UAV tracking algorithm, SiamHSFT, that aims to balance tracking robustness and real-time computation. First, by combining CBAM attention with downward information interaction in the feature enhancement module, the proposed method merges high-level and low-level feature maps to prevent information loss when dealing with small targets. Second, in the interlaced sparse attention module, it attends to both long and short spatial intervals within the affinity, thereby improving the use of global context and prioritizing crucial information during feature extraction. Lastly, the Transformer encoder is optimized with a modulation enhancement layer that integrates triplet attention to strengthen inter-layer dependencies and improve target discrimination. Experimental results demonstrate SiamHSFT's strong performance on the UAV123, UAV20L, UAV123@10fps, and DTB70 datasets. Notably, it performs particularly well in fast-motion and motion-blur scenarios. It also maintains an average tracking speed of 126.7 fps across all datasets, meeting real-time tracking requirements.
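The interlaced sparse attention idea mentioned in the abstract (attending over long spatial intervals, then over short ones) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 1D sequence of feature vectors instead of a 2D feature map, uses the raw features as queries, keys, and values with no learned projections, and the names `long_range_groups`, `short_range_groups`, `attend`, `interlaced_sparse_attention`, and the `block` parameter are hypothetical, introduced only for illustration.

```python
import math


def long_range_groups(n, block):
    """Group positions that share the same offset inside their block (stride = block)."""
    return [list(range(offset, n, block)) for offset in range(block)]


def short_range_groups(n, block):
    """Group neighbouring positions that fall inside the same block."""
    return [list(range(start, min(start + block, n))) for start in range(0, n, block)]


def attend(features, groups):
    """Scaled dot-product self-attention restricted to each index group."""
    dim = len(features[0])
    out = [None] * len(features)
    for group in groups:
        for i in group:
            scores = [
                sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(dim)
                for j in group
            ]
            peak = max(scores)  # subtract the max for a numerically stable softmax
            weights = [math.exp(s - peak) for s in scores]
            total = sum(weights)
            out[i] = [
                sum(w * features[j][d] for w, j in zip(weights, group)) / total
                for d in range(dim)
            ]
    return out


def interlaced_sparse_attention(features, block):
    """Long-interval attention followed by short-interval attention (illustrative only)."""
    n = len(features)
    mixed = attend(features, long_range_groups(n, block))
    return attend(mixed, short_range_groups(n, block))


if __name__ == "__main__":
    feats = [[float(i), 1.0] for i in range(6)]
    print(long_range_groups(6, 3))
    print(short_range_groups(6, 3))
    print(interlaced_sparse_attention(feats, block=3))
```

In this sketch, each position exchanges information with distant positions in the first pass and with its neighbours in the second, so information can reach every position after the two passes while each pass computes attention only within small groups.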