FV-MViT: Mobile Vision Transformer for Finger Vein Recognition.

Affiliation

College of Physics and Optoelectronic Engineering, Shenzhen University, Shenzhen 518060, China.

Publication Information

Sensors (Basel). 2024 Feb 19;24(4):1331. doi: 10.3390/s24041331.

Abstract

To address the challenges of high parameter counts and limited training samples in finger vein recognition, we present the FV-MViT model, a lightweight deep learning solution that emphasizes high accuracy, a portable design, and low latency. FV-MViT introduces two key components. The Mul-MV2 Block uses a dual-path inverted residual structure with multi-scale convolutions to extract additional local features. The Enhanced MobileViT Block removes the large-scale convolution block at the beginning of the original MobileViT Block, converts the Transformer's self-attention into separable self-attention with linear complexity, and optimizes the back end of the original block with depth-wise separable convolutions; this extracts global features while effectively reducing parameter counts and feature extraction time. Additionally, we introduce a soft target center cross-entropy loss function to enhance generalization and increase accuracy. Experimental results show that FV-MViT achieves recognition accuracies of 99.53% and 100.00% on the Shandong University (SDU) and Universiti Teknologi Malaysia (USM) datasets, with equal error rates of 0.47% and 0.02%, respectively. The model has 5.26 million parameters and a latency of 10.00 milliseconds from sample input to recognition output. Comparison with state-of-the-art (SOTA) methods shows competitive performance for FV-MViT.
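
The abstract gives only a high-level description of the Mul-MV2 Block. A minimal PyTorch sketch of a dual-path inverted residual block with multi-scale depth-wise convolutions might look as follows; the 3x3/5x5 kernel pair, the expansion factor of 4, and the additive fusion are illustrative assumptions rather than the paper's specification.

```python
import torch
import torch.nn as nn

def inverted_residual_path(ch: int, kernel: int, expand: int = 4) -> nn.Sequential:
    """One MobileNetV2-style inverted residual path: 1x1 expansion,
    k x k depth-wise convolution, 1x1 projection back to ch channels."""
    hidden = ch * expand
    return nn.Sequential(
        nn.Conv2d(ch, hidden, 1, bias=False),
        nn.BatchNorm2d(hidden),
        nn.SiLU(),
        nn.Conv2d(hidden, hidden, kernel, padding=kernel // 2,
                  groups=hidden, bias=False),  # depth-wise convolution
        nn.BatchNorm2d(hidden),
        nn.SiLU(),
        nn.Conv2d(hidden, ch, 1, bias=False),
        nn.BatchNorm2d(ch),
    )

class MulMV2Block(nn.Module):
    """Hypothetical dual-path inverted residual block: two parallel
    paths with different depth-wise kernel sizes capture features at
    two scales and are fused additively with the residual input."""

    def __init__(self, ch: int):
        super().__init__()
        self.path3 = inverted_residual_path(ch, kernel=3)
        self.path5 = inverted_residual_path(ch, kernel=5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.path3(x) + self.path5(x)
```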

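The separable self-attention mentioned above matches the MobileViT-v2 formulation, in which each token receives a scalar context score, and the softmax-normalized scores pool the keys into one global context vector that modulates the values; the cost is therefore linear rather than quadratic in the number of tokens. A minimal single-head sketch under that assumption (the class name and layer layout are illustrative):

```python
import torch
import torch.nn as nn

class SeparableSelfAttention(nn.Module):
    """Separable self-attention with O(n) complexity, sketched after
    MobileViT-v2; not necessarily the paper's exact implementation."""

    def __init__(self, dim: int):
        super().__init__()
        # One projection yields the context scores (1 channel),
        # the keys (dim channels), and the values (dim channels).
        self.qkv = nn.Linear(dim, 1 + 2 * dim)
        self.out = nn.Linear(dim, dim)
        self.dim = dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        scores, keys, values = torch.split(
            self.qkv(x), [1, self.dim, self.dim], dim=-1
        )
        ctx_scores = torch.softmax(scores, dim=1)                  # (B, n, 1)
        ctx_vector = (ctx_scores * keys).sum(dim=1, keepdim=True)  # (B, 1, dim)
        return self.out(torch.relu(values) * ctx_vector)           # (B, n, dim)
```

Because the context vector is shared across all tokens, the per-layer cost grows as O(n·d) rather than O(n²·d), which is what makes the block attractive under the low-latency budget reported in the abstract.
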
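The soft target center cross-entropy loss is not defined in the abstract. One plausible reading, sketched below purely as an assumption, combines a label-smoothed (soft target) cross-entropy with a center loss that pulls each embedding toward a learnable class center; the smoothing factor and the weight lam are hypothetical choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftTargetCenterLoss(nn.Module):
    """Hypothetical reconstruction: label-smoothed cross-entropy plus
    a weighted center loss. The paper's exact formulation may differ;
    every constant here is an assumption."""

    def __init__(self, num_classes: int, feat_dim: int,
                 smoothing: float = 0.1, lam: float = 0.01):
        super().__init__()
        self.smoothing = smoothing
        self.lam = lam
        self.num_classes = num_classes
        # Learnable per-class feature centers.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, logits: torch.Tensor, feats: torch.Tensor,
                target: torch.Tensor) -> torch.Tensor:
        # Soft-target (label-smoothed) cross-entropy.
        log_probs = F.log_softmax(logits, dim=-1)
        soft = torch.full_like(log_probs,
                               self.smoothing / (self.num_classes - 1))
        soft.scatter_(1, target.unsqueeze(1), 1.0 - self.smoothing)
        ce = -(soft * log_probs).sum(dim=-1).mean()
        # Center loss: squared distance of each feature to its class center.
        center = ((feats - self.centers[target]) ** 2).sum(dim=-1).mean()
        return ce + self.lam * center
```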
