Geravanchizadeh Masoud, Shaygan Asl Amir, Danishvar Sebelan
Faculty of Electrical & Computer Engineering, University of Tabriz, Tabriz 51666-15813, Iran.
College of Engineering, Design and Physical Sciences, Brunel University London, London UB8 3PH, UK.
Bioengineering (Basel). 2024 Nov 30;11(12):1216. doi: 10.3390/bioengineering11121216.
Attention is one of many human cognitive functions that are essential in everyday life. Given our limited processing capacity, attention helps us focus only on what matters. Focusing attention on one speaker in an environment with many speakers is a critical ability of the human auditory system. This paper proposes a new end-to-end method based on a combined transformer and graph convolutional neural network (TraGCNN) that can effectively detect auditory attention from electroencephalograms (EEGs). This approach eliminates the need for manual feature extraction, which is often time-consuming and subjective. Here, EEG signals are first converted to graphs. We then extract attention information from these graphs using spatial and temporal approaches. Finally, our models are trained with these data. Our model can detect auditory attention in both the spatial and temporal domains. The EEG input is first processed by transformer layers to obtain a sequential representation of the EEG based on attention onsets. Then, a family of graph convolutional layers is used to find the most active electrodes using the spatial positions of the electrodes. Finally, the corresponding EEG features of the active electrodes are fed into graph attention layers to detect auditory attention. The Fuglsang 2020 dataset is used in the experiments to train and test the proposed and baseline systems. Compared with state-of-the-art attention classification methods from the literature, the new TraGCNN approach yields the highest classification accuracy (80.12%). Additionally, the proposed model outperforms our previous graph-based model across different lengths of EEG segments. The new TraGCNN approach is advantageous because attention detection is achieved from the EEG signals of subjects without requiring speech stimuli, as is the case with conventional auditory attention detection methods.
Furthermore, evaluating the proposed model on EEG segments of different lengths shows that it is faster than our previous graph-based detection method in terms of computational complexity. The findings of this study have important implications for the understanding and assessment of auditory attention, which is crucial for many applications, such as brain-computer interface (BCI) systems, speech separation, and neuro-steered hearing aid development.
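The three-stage pipeline described above (transformer layers over the EEG time axis, graph convolution over the spatial electrode layout, and a graph-attention readout for classification) can be sketched in NumPy. This is a minimal illustrative forward pass only: all dimensions, random weights, and the toy electrode adjacency below are assumptions for demonstration, not values or details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): C electrodes, T samples, d features.
C, T, d = 8, 16, 4

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product attention over the time axis
    (the transformer stage that yields a sequential EEG representation)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))
    return scores @ V

def gcn_layer(H, A, W):
    """One graph-convolution step over the electrode graph:
    symmetric normalisation of adjacency A, feature mixing, ReLU."""
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ W, 0.0)

def gat_pool(H, a):
    """Toy graph-attention readout: score each electrode, then return
    the attention-weighted sum of electrode features."""
    alpha = softmax(H @ a)                          # weights over electrodes
    return alpha @ H                                # graph-level embedding

# --- forward pass for one EEG segment -------------------------------
X = rng.standard_normal((T, C))                     # time steps x electrodes
Wq, Wk, Wv = (rng.standard_normal((C, C)) * 0.1 for _ in range(3))
Z = self_attention(X, Wq, Wk, Wv)                   # (T, C) temporal stage

# Toy symmetric electrode adjacency (a real model would build this
# from the spatial positions of the electrodes on the scalp).
A = np.triu((rng.random((C, C)) < 0.3).astype(float), 1)
A = A + A.T

H = gcn_layer(Z.T, A, rng.standard_normal((T, d)) * 0.1)   # (C, d) spatial stage
g = gat_pool(H, rng.standard_normal(d))                    # (d,) readout
probs = softmax(g @ (rng.standard_normal((d, 2)) * 0.1))   # attended-speaker probs
```

The final softmax produces a two-class posterior (e.g., attended speaker left vs. right); a trained model would learn all of the weight matrices end-to-end rather than sampling them randomly.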