Yousufi Musyyab, Damaševičius Robertas, Maskeliūnas Rytis
Centre of Real Time Computer Systems, Kaunas University of Technology, 51368 Kaunas, Lithuania.
Brain Sci. 2024 Oct 15;14(10):1018. doi: 10.3390/brainsci14101018.
BACKGROUND/OBJECTIVES: This study investigates the classification of Major Depressive Disorder (MDD) using electroencephalography (EEG) Short-Time Fourier Transform (STFT) spectrograms and audio Mel-spectrograms from 52 subjects. The objective is to develop a multimodal classification model that integrates audio and EEG data to accurately identify depressive tendencies.
METHODS: We used the Multimodal Open Dataset for Mental-Disorder Analysis (MODMA) and fine-tuned a pre-trained DenseNet121 model via transfer learning. Features were extracted from the EEG and audio modalities and concatenated before being passed to the final classification layer. Additionally, an ablation study was conducted on the EEG and audio data separately.
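The feature-level fusion described above can be illustrated with a minimal, framework-free sketch. It assumes each branch yields a 1024-dimensional feature vector (the size DenseNet121 passes to its classifier) and that the task is binary (MDD vs. healthy control). The function names, the random stand-in features, and the random weights are hypothetical and are not taken from the paper.

```python
import math
import random

def linear(x, weights, bias):
    """Dense layer: one logit per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_and_classify(eeg_feat, audio_feat, weights, bias):
    """Concatenate the two modality features, then apply one classifier layer."""
    fused = eeg_feat + audio_feat  # list concatenation = feature-level fusion
    return softmax(linear(fused, weights, bias))

FEAT_DIM = 1024   # assumption: pooled DenseNet121 features per branch
NUM_CLASSES = 2   # assumption: MDD vs. healthy control

rng = random.Random(0)
# Random stand-ins for the features a trained backbone would produce.
eeg_feat = [rng.random() for _ in range(FEAT_DIM)]
audio_feat = [rng.random() for _ in range(FEAT_DIM)]
# Untrained, randomly initialised classifier weights (illustration only).
weights = [[rng.gauss(0.0, 0.01) for _ in range(2 * FEAT_DIM)]
           for _ in range(NUM_CLASSES)]
bias = [0.0] * NUM_CLASSES

probs = fuse_and_classify(eeg_feat, audio_feat, weights, bias)
print(len(eeg_feat + audio_feat), probs)
```

In this sketch, dropping one of the two feature vectors (and halving the classifier's input width) gives the single-modality configuration that an ablation would compare against.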
RESULTS: The proposed multimodal classification model outperformed existing methods, achieving an accuracy of 97.53%, precision of 98.20%, F1 score of 97.76%, and recall of 97.32%. A confusion matrix was also used to evaluate the model's effectiveness.
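As a reminder of how these metrics relate to a confusion matrix, the sketch below computes them from binary counts and checks that the reported F1 score is the harmonic mean of the reported precision and recall. The confusion-matrix counts are hypothetical and serve only to illustrate the formulas; they are not results from the study.

```python
def metrics_from_confusion(tp, fp, fn, tn):
    """Standard binary metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}

def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, for illustration only (not from the paper).
example = metrics_from_confusion(tp=8, fp=2, fn=1, tn=9)

# Consistency check on the reported percentages.
reported_f1 = f1_from(98.20, 97.32)
print(round(reported_f1, 2))  # 97.76, matching the reported F1 score
```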
CONCLUSIONS: The paper presents a robust multimodal classification approach that outperforms state-of-the-art methods, with potential application in clinical diagnostics for depression assessment.