Liang Lijuan, Wang Yang, Ma Hui, Zhang Ran, Liu Rongxun, Zhu Rongxin, Zheng Zhiguo, Zhang Xizhe, Wang Fei
Laboratory of Psychology, The First Affiliated Hospital of Hainan Medical University, Haikou, Hainan, China.
Early Intervention Unit, Department of Psychiatry, Affiliated Nanjing Brain Hospital, Nanjing Medical University, Nanjing, China.
Front Psychiatry. 2024 Sep 17;15:1422020. doi: 10.3389/fpsyt.2024.1422020. eCollection 2024.
Previous studies have classified major depressive disorder (MDD) patients and healthy controls based on vocal acoustic features, but classification accuracy still needs improvement. This study therefore used deep learning methods to construct classification and prediction models for the major depression and healthy control groups.
A total of 120 participants aged 16-25 took part in this study, including 64 in the MDD group and 56 in the healthy control (HC) group. We used the Covarep open-source algorithm to extract a total of 1200 high-level statistical functionals for each sample. In addition, we used Python for correlation analysis and a neural network to build models to distinguish whether participants had depression, to predict the total depression score, and to evaluate the effectiveness of the classification and prediction models.
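The pipeline described above (correlation screening of acoustic features, then a neural network classifier) can be sketched as follows. This is a minimal illustrative sketch on synthetic data, not the study's actual code: the feature values, correlation threshold, and network size are all assumptions made for demonstration.

```python
# Hypothetical sketch of the screening + classification pipeline.
# Toy data stands in for Covarep-style statistical functionals:
# 120 participants x 1200 features, binary label (1 = MDD, 0 = HC).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 1200))
y = np.array([1] * 64 + [0] * 56)
X[:64, :50] += 0.8          # make the first 50 features weakly informative

# Step 1: correlation screening -- keep features whose point-biserial
# correlation with the diagnosis label exceeds an (assumed) threshold.
r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
keep = np.abs(r) > 0.2
Xs = X[:, keep]

# Step 2: a minimal one-hidden-layer network trained with batch gradient
# descent on binary cross-entropy, standing in for the deep model.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(scale=0.1, size=(Xs.shape[1], 16))
b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=16)
b2 = 0.0
lr = 0.05
for _ in range(500):
    h = np.tanh(Xs @ W1 + b1)           # hidden layer
    p = sigmoid(h @ W2 + b2)            # P(MDD)
    g = (p - y) / len(y)                # dLoss/dlogit
    W2 -= lr * h.T @ g
    b2 -= lr * g.sum()
    gh = np.outer(g, W2) * (1 - h ** 2) # backprop through tanh
    W1 -= lr * Xs.T @ gh
    b1 -= lr * gh.sum(axis=0)

train_acc = ((p > 0.5) == y).mean()
print(f"kept {keep.sum()} features, training accuracy {train_acc:.2f}")
```

In practice a held-out test set or cross-validation would be needed to estimate generalization; the accuracy above is on the training data only.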
Classification modelling of the major depression and healthy control groups using the correlated, statistically significant vocal acoustic features reached 0.90, and Receiver Operating Characteristic (ROC) curve analysis showed a classification accuracy of 84.16%, a sensitivity of 95.38%, and a specificity of 70.9%. The depression prediction model based on speech features showed that the predicted score was closely related to the total score of the 17-item Hamilton Depression Scale (HAMD-17) (r = 0.687, P < 0.01), and the Mean Absolute Error (MAE) between the model's predicted score and the total HAMD-17 score was 4.51.
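The evaluation metrics reported above can be computed from a confusion matrix (for accuracy, sensitivity, and specificity) and from paired score vectors (for Pearson's r and MAE). The short sketch below uses made-up stand-in numbers, not the study's data, purely to show the definitions of the metrics:

```python
# Illustrative metric computation on made-up data (not the study's results).
import numpy as np

# Binary classification: 1 = MDD, 0 = HC.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 1, 0, 1])
y_pred = np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 1])

tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
tn = np.sum((y_pred == 0) & (y_true == 0))  # true negatives
fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives

accuracy = (tp + tn) / len(y_true)
sensitivity = tp / (tp + fn)   # true positive rate
specificity = tn / (tn + fp)   # true negative rate

# Severity prediction vs. HAMD-17 total score (stand-in values).
hamd = np.array([22.0, 18.0, 25.0, 9.0, 5.0, 7.0, 12.0, 20.0, 4.0, 17.0])
pred = np.array([19.0, 16.0, 27.0, 12.0, 8.0, 6.0, 10.0, 22.0, 7.0, 15.0])
r = np.corrcoef(hamd, pred)[0, 1]          # Pearson correlation
mae = np.mean(np.abs(hamd - pred))         # Mean Absolute Error

print(accuracy, sensitivity, specificity, round(r, 3), mae)
# → 0.8 0.8333333333333334 0.75 ... 2.3
```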
This study's results may have been influenced by comorbid anxiety.
Vocal acoustic features can not only effectively distinguish the major depression group from the healthy control group, but also accurately predict the severity of depressive symptoms.