Pan Hongguang, Wang Yiran, Li Zhuoyi, Chu Xin, Teng Bingyang, Gao Hongzheng
IEEE Trans Biomed Eng. 2024 Aug;71(8):2454-2462. doi: 10.1109/TBME.2024.3376603. Epub 2024 Jul 18.
Some classification studies of brain-computer interfaces (BCIs) based on speech imagery show potential for improving communication in patients with amyotrophic lateral sclerosis (ALS). However, current research on speech imagery is limited in scope, focusing primarily on vowels or a few selected words. In this paper, we propose a complete research scheme for multi-character classification based on EEG signals derived from speech imagery. First, we record 31 speech-imagery items, comprising the 26 letters of the English alphabet and five commonly used punctuation marks, from seven subjects using a 32-channel electroencephalogram (EEG) device. Second, we introduce the wavelet scattering transform (WST), which is structurally similar to convolutional neural networks (CNNs), for feature extraction. The WST is a knowledge-driven technique that preserves high-frequency information and maintains the deformation stability of EEG signals. To reduce the dimensionality of the wavelet scattering coefficients, we employ kernel principal component analysis (KPCA). Finally, the reduced features are fed into an extreme gradient boosting (XGBoost) classifier within a multi-classification framework. The XGBoost classifier is tuned via grid search with 10-fold cross-validation, yielding an average accuracy of 78.73% on the multi-character classification task. We use t-distributed stochastic neighbor embedding (t-SNE) to visualize the low-dimensional representation of multi-character speech imagery; this visualization reveals the clustering of similar characters. The experimental results demonstrate the effectiveness of the proposed multi-character classification scheme, and both our number of classification categories and our accuracy exceed those reported in existing research.
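The pipeline described in the abstract (scattering-style features → KPCA → gradient-boosted classifier tuned by grid search with cross-validation → t-SNE visualization) can be sketched as below. This is a minimal illustration on synthetic data, not the authors' implementation: the random matrix stands in for wavelet scattering coefficients, scikit-learn's `GradientBoostingClassifier` is used as a stand-in for XGBoost, and the cross-validation fold count and grid are shrunk for brevity (the paper uses 10-fold CV).

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import GradientBoostingClassifier  # stand-in for XGBoost
from sklearn.manifold import TSNE
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for wavelet scattering coefficients:
# 124 trials x 64 features, 4 imagined-character classes (hypothetical sizes).
X = rng.normal(size=(124, 64))
y = rng.integers(0, 4, size=124)
X[np.arange(124), y] += 3.0  # inject simple class-dependent structure

# KPCA reduces the dimensionality of the scattering-coefficient features.
kpca = KernelPCA(n_components=10, kernel="rbf")
X_red = kpca.fit_transform(X)

# Hyperparameter tuning via grid search with cross-validation
# (10-fold in the paper; 5-fold and a tiny grid here to keep the sketch fast).
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [30], "max_depth": [2, 3]},
    cv=5,
)
X_tr, X_te, y_tr, y_te = train_test_split(X_red, y, random_state=0, stratify=y)
grid.fit(X_tr, y_tr)
acc = grid.score(X_te, y_te)

# t-SNE embeds the reduced features in 2-D to inspect character clustering.
emb = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X_red)
print(f"test accuracy: {acc:.2f}, t-SNE embedding shape: {emb.shape}")
```

In the paper, the input features would be wavelet scattering coefficients computed from the 32-channel EEG trials (e.g., via a scattering library such as Kymatio) rather than random data, and the classifier would be XGBoost's `XGBClassifier` with the full 10-fold grid search.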