Liu Ke, Yang Tao, Yu Zhuliang, Yi Weibo, Yu Hong, Wang Guoyin, Wu Wei
IEEE J Biomed Health Inform. 2024 Dec;28(12):7126-7137. doi: 10.1109/JBHI.2024.3450753. Epub 2024 Dec 5.
Transformer-based neural networks have been applied to electroencephalography (EEG) decoding for motor imagery (MI). However, most of these networks apply the self-attention mechanism only to extract global temporal information and neglect cross-frequency coupling features. In addition, integrating different types of neural networks effectively remains a challenge in the design of decoding algorithms.
This study proposes a novel end-to-end Multi-Scale Vision Transformer Neural Network (MSVTNet) for MI-EEG classification. MSVTNet first extracts local spatio-temporal features at different filtered scales through convolutional neural networks (CNNs). These features are then concatenated along the feature dimension to form local multi-scale spatio-temporal feature tokens. Finally, Transformers capture cross-scale interaction information and global temporal correlations, providing more distinguishable feature embeddings for classification. In addition, an auxiliary branch loss provides intermediate supervision to ensure that the CNNs and Transformers are integrated effectively.
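The token-forming step and the auxiliary loss described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the random kernels, the single-channel input (spatial filtering is omitted), the kernel lengths, the pooling size, the averaging of branch losses, and the `aux_weight` parameter are all assumptions made for illustration, and the Transformer stage is left out entirely.

```python
import math
import random

def temporal_conv(signal, kernel):
    """Zero-padded 'same' 1-D cross-correlation of one channel (odd kernel length)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [sum(k * padded[t + i] for i, k in enumerate(kernel))
            for t in range(len(signal))]

def avg_pool(seq, size):
    """Non-overlapping average pooling; an incomplete tail window is dropped."""
    return [sum(seq[i:i + size]) / size
            for i in range(0, len(seq) - size + 1, size)]

def branch_features(signal, kernel_len, n_filters, pool, rng):
    """One scale branch: n_filters random temporal kernels, then pooling.

    Returns a list of time steps, each holding n_filters feature values.
    """
    feats = []
    for _ in range(n_filters):
        kernel = [rng.uniform(-1, 1) for _ in range(kernel_len)]
        feats.append(avg_pool(temporal_conv(signal, kernel), pool))
    return [list(step) for step in zip(*feats)]  # (time_steps, n_filters)

def multi_scale_tokens(signal, kernel_lens, n_filters, pool, rng):
    """Concatenate branch outputs along the feature dimension, one token per time step."""
    branches = [branch_features(signal, k, n_filters, pool, rng)
                for k in kernel_lens]
    return [sum((b[t] for b in branches), []) for t in range(len(branches[0]))]

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single example."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]

def total_loss(main_logits, branch_logits, label, aux_weight=1.0):
    """Main loss plus the mean auxiliary loss over branch classifiers (assumed weighting)."""
    main = cross_entropy(main_logits, label)
    aux = sum(cross_entropy(l, label) for l in branch_logits) / len(branch_logits)
    return main + aux_weight * aux

rng = random.Random(0)
signal = [math.sin(0.1 * t) for t in range(256)]
tokens = multi_scale_tokens(signal, kernel_lens=[15, 31, 63],
                            n_filters=4, pool=32, rng=rng)
print(len(tokens), len(tokens[0]))  # 8 12
```

With three branches of four filters each and a pooling size of 32 over 256 samples, the sketch yields 8 tokens of 12 features. In a full model these tokens would be passed to a Transformer encoder, and each branch would have its own classifier head supplying the logits used in the auxiliary term.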
The performance of MSVTNet was assessed through subject-dependent (session-dependent and session-independent) and subject-independent experiments on three MI datasets: the BCI Competition IV 2a and 2b datasets and the OpenBMI dataset. The experimental results demonstrate that MSVTNet achieves state-of-the-art performance in all analyses.
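The three evaluation settings differ only in how trials are divided into training and test sets. The sketch below shows one common reading of each setting. The abstract does not specify the exact protocols, so the `Trial` record, the function names, the 80/20 within-session split, and the leave-one-subject-out scheme are assumptions rather than the study's procedure.

```python
from collections import namedtuple

# Hypothetical trial record: which subject, which recording session, trial index.
Trial = namedtuple("Trial", ["subject", "session", "index"])

def session_dependent(trials, subject, session, train_fraction=0.8):
    """Train and test on trials from the same subject and the same session."""
    pool = [t for t in trials if t.subject == subject and t.session == session]
    cut = int(len(pool) * train_fraction)
    return pool[:cut], pool[cut:]

def session_independent(trials, subject, train_session, test_session):
    """Train on one session of a subject and test on a different session."""
    train = [t for t in trials
             if t.subject == subject and t.session == train_session]
    test = [t for t in trials
            if t.subject == subject and t.session == test_session]
    return train, test

def subject_independent(trials, held_out_subject):
    """Leave-one-subject-out: train on all other subjects, test on the held-out one."""
    train = [t for t in trials if t.subject != held_out_subject]
    test = [t for t in trials if t.subject == held_out_subject]
    return train, test

# Toy data: 3 subjects, 2 sessions each, 10 trials per session.
trials = [Trial(s, sess, i)
          for s in (1, 2, 3) for sess in ("A", "B") for i in range(10)]

train, test = subject_independent(trials, held_out_subject=3)
print(len(train), len(test))  # 40 20
```

The subject-independent setting is the strictest of the three, because no data from the test subject is available during training.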
MSVTNet shows superiority and robustness in enhancing MI decoding performance.