School of Automation Science and Electrical Engineering, Beihang University, Beijing, China; Hangzhou Innovation Institute, Beihang University, Hangzhou, China.
School of Electrical Engineering and Automation, Anhui University, Hefei, China.
Comput Biol Med. 2024 Jun;175:108504. doi: 10.1016/j.compbiomed.2024.108504. Epub 2024 Apr 24.
Convolutional neural networks (CNNs) have been widely applied in motor imagery (MI)-based brain-computer interfaces (BCIs) to decode electroencephalography (EEG) signals. However, owing to the limited receptive field of the convolutional kernel, a CNN extracts features only from local regions and does not model the long-term dependencies needed for EEG decoding. Beyond long-term dependencies, multi-modal temporal information is equally important for EEG decoding because it offers a more comprehensive view of the temporal dynamics of the underlying neural processes. In this paper, we propose a novel deep learning network that combines a CNN with a self-attention mechanism to encapsulate multi-modal temporal information and global dependencies. The network first extracts multi-modal temporal information from two distinct perspectives: average and variance. A shared self-attention module is then designed to capture global dependencies along these two feature dimensions. We further design a convolutional encoder to explore the relationship between the average-pooled and variance-pooled features and fuse them into more discriminative features. Moreover, a data augmentation method based on signal segmentation and recombination is proposed to improve the generalization capability of the network. Experimental results on the BCI Competition IV-2a (BCIC-IV-2a) and BCI Competition IV-2b (BCIC-IV-2b) datasets show that the proposed method outperforms state-of-the-art methods and achieves a four-class average accuracy of 85.03% on the BCIC-IV-2a dataset. These results demonstrate the effectiveness of multi-modal temporal information fusion in attention-based deep learning networks and provide a new perspective for MI-EEG decoding. The code is available at https://github.com/Ma-Xinzhi/EEG-TransNet.
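The following is a minimal PyTorch sketch of the pipeline outlined in the abstract (convolutional front end, average- and variance-pooled temporal streams, a shared self-attention module, and a small convolutional encoder for fusion). It is not the authors' EEG-TransNet implementation; the layer sizes, pooling windows, and fusion layout are illustrative assumptions.

```python
# Sketch only: architectural hyperparameters are assumptions, not the paper's.
import torch
import torch.nn as nn


class MultiModalTemporalAttention(nn.Module):
    """Average- and variance-pooled temporal features passed through one
    shared self-attention block, then fused by a convolutional encoder."""

    def __init__(self, in_channels=22, embed_dim=40, pool_len=75, pool_stride=15,
                 num_heads=4, num_classes=4):
        super().__init__()
        # Temporal + spatial convolution front end (ShallowConvNet-style).
        self.temporal_conv = nn.Conv2d(1, embed_dim, kernel_size=(1, 25), padding=(0, 12))
        self.spatial_conv = nn.Conv2d(embed_dim, embed_dim, kernel_size=(in_channels, 1))
        self.bn = nn.BatchNorm2d(embed_dim)
        self.elu = nn.ELU()
        # Two temporal "modalities": sliding-window mean and variance.
        self.avg_pool = nn.AvgPool2d(kernel_size=(1, pool_len), stride=(1, pool_stride))
        # A single self-attention module whose weights are shared by both streams.
        self.shared_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Convolutional encoder that fuses the two attended streams.
        self.fuse = nn.Sequential(
            nn.Conv1d(2 * embed_dim, embed_dim, kernel_size=1),
            nn.BatchNorm1d(embed_dim),
            nn.ELU(),
        )
        self.classifier = nn.Linear(embed_dim, num_classes)

    def _var_pool(self, x):
        # Variance over sliding temporal windows: E[x^2] - (E[x])^2.
        mean = self.avg_pool(x)
        mean_sq = self.avg_pool(x ** 2)
        return (mean_sq - mean ** 2).clamp_min(1e-6)

    def forward(self, x):
        # x: (batch, channels, time) raw EEG.
        x = x.unsqueeze(1)                                                # (B, 1, C, T)
        x = self.elu(self.bn(self.spatial_conv(self.temporal_conv(x))))  # (B, D, 1, T)
        avg_feat = self.avg_pool(x).squeeze(2).transpose(1, 2)           # (B, L, D)
        var_feat = self._var_pool(x).squeeze(2).transpose(1, 2)          # (B, L, D)
        # The same attention weights are applied to both feature streams.
        avg_att, _ = self.shared_attn(avg_feat, avg_feat, avg_feat)
        var_att, _ = self.shared_attn(var_feat, var_feat, var_feat)
        fused = torch.cat([avg_att, var_att], dim=2).transpose(1, 2)     # (B, 2D, L)
        fused = self.fuse(fused).mean(dim=2)                             # (B, D)
        return self.classifier(fused)


if __name__ == "__main__":
    model = MultiModalTemporalAttention()
    dummy = torch.randn(8, 22, 1000)  # 8 trials, 22 channels, 4 s at 250 Hz
    print(model(dummy).shape)         # torch.Size([8, 4])
```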
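The abstract also mentions a segmentation-and-recombination augmentation. The sketch below illustrates one common reading of that idea: each trial is cut into temporal segments, and new trials are assembled from segments drawn from different trials of the same class. The segment count and number of generated trials are assumptions, not the paper's settings.

```python
# Hedged sketch of segmentation-and-recombination augmentation; parameters are illustrative.
import numpy as np


def segment_recombine(trials, labels, n_segments=8, n_new_per_class=100, rng=None):
    """trials: (N, C, T) array, labels: (N,) array. Returns augmented trials and labels."""
    rng = np.random.default_rng(rng)
    n, c, t = trials.shape
    seg_len = t // n_segments
    new_trials, new_labels = [], []
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)
        for _ in range(n_new_per_class):
            # Pick a (possibly different) source trial for every temporal segment.
            sources = rng.choice(cls_idx, size=n_segments, replace=True)
            segments = [trials[src, :, i * seg_len:(i + 1) * seg_len]
                        for i, src in enumerate(sources)]
            new_trials.append(np.concatenate(segments, axis=-1))
            new_labels.append(cls)
    return np.stack(new_trials), np.asarray(new_labels)


# Example: 4-class MI data shaped like BCIC-IV-2a (22 channels, 1000 samples per trial).
X = np.random.randn(288, 22, 1000)
y = np.repeat(np.arange(4), 72)
X_aug, y_aug = segment_recombine(X, y, n_segments=8, n_new_per_class=50)
print(X_aug.shape, y_aug.shape)  # (200, 22, 1000) (200,)
```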