Sheng Wenshun, Zheng Ziling, Zhu Hanzhi
Pujiang Institute, Nanjing Tech University, Nanjing, 211200, China.
Sci Rep. 2025 May 20;15(1):17539. doi: 10.1038/s41598-025-02757-7.
This paper proposes a non-contact palm vein image segmentation model that integrates multiscale convolution with a Swin Transformer. Built on an enhanced U-Net architecture, the downsampling path employs a multiscale convolution module to extract hierarchical features, while the upsampling path captures the global vein distribution through a sliding-window attention mechanism. A feature fusion module suppresses background interference by integrating cross-layer information. Experimental results show that the model achieves an accuracy of 97.8% and a Dice coefficient of 94.5% on the PolyU and CASIA datasets, a 3.2% improvement over the baseline U-Net. Ablation studies confirm that the proposed modules act synergistically. The model markedly improves the robustness of palm vein recognition under complex illumination and in noisy environments.
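The sliding-window attention used in the Swin Transformer upsampling path operates on non-overlapping windows of the feature map. The partition step it relies on can be sketched in NumPy as follows (the feature-map shape and window size here are hypothetical, not values from the paper):

```python
import numpy as np

def window_partition(x, win):
    """Split an (H, W, C) feature map into non-overlapping win x win windows.

    Returns an array of shape (num_windows, win, win, C); attention is then
    computed independently within each window.
    """
    H, W, C = x.shape
    assert H % win == 0 and W % win == 0, "H and W must be divisible by win"
    x = x.reshape(H // win, win, W // win, win, C)
    # Reorder so that each window's rows and columns are contiguous.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win, win, C)

# Example: an 8x8 feature map with 3 channels, split into 4x4 windows.
feat = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
windows = window_partition(feat, 4)
print(windows.shape)  # (4, 4, 4, 3): four 4x4 windows, 3 channels each
```

Shifting the window grid between successive layers (the "sliding" in sliding-window attention) then lets information flow across window boundaries while keeping the per-window attention cost fixed.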
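The Dice coefficient reported above measures the overlap between the predicted vein mask and the ground-truth mask. A minimal NumPy sketch of the metric (the example masks are illustrative, not data from the paper):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary segmentation masks.

    eps guards against division by zero when both masks are empty.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy 2x3 masks: 2 overlapping pixels, 3 foreground pixels in each mask.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
print(round(dice_coefficient(pred, target), 3))  # 2*2 / (3+3) = 0.667
```

Unlike pixel accuracy, Dice is insensitive to the large vein-free background area, which is why segmentation papers typically report both.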