Pan Wei, Deng Fusong, Wang Xianbin, Hang Bowen, Zhou Wenwei, Zhu Tingshao
Key Laboratory of Adolescent Cyberpsychology and Behavior (CCNU), Ministry of Education, Wuhan, China.
School of Psychology, Central China Normal University, Wuhan, China.
Front Psychiatry. 2023 Jul 20;14:1079448. doi: 10.3389/fpsyt.2023.1079448. eCollection 2023.
Vocal features have been exploited to distinguish depression from healthy controls. While there have been some claims of success, the degree to which changes in vocal features are specific to depression has not been systematically studied. We therefore examined how well vocal features differentiate depression from bipolar disorder (BD), schizophrenia, and healthy controls, as well as the pairwise classifications among the three disorders.
We sampled 32 bipolar disorder patients, 106 depression patients, 114 healthy controls, and 20 schizophrenia patients. We extracted i-vectors from Mel-frequency cepstral coefficients (MFCCs), built logistic regression models with ridge regularization and 5-fold cross-validation on the training set, and then applied the models to the test set. There were seven classification tasks: any disorder versus healthy controls; depression versus healthy controls; BD versus healthy controls; schizophrenia versus healthy controls; depression versus BD; depression versus schizophrenia; and BD versus schizophrenia.
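As an illustration of the modelling pipeline described above, the following Python sketch (using librosa and scikit-learn, neither of which the abstract names) trains a ridge-regularized logistic regression with 5-fold cross-validation on MFCC-based features and evaluates it on a held-out test set. A full i-vector front end (GMM-UBM plus total-variability modelling) is not reproduced here; summary MFCC statistics stand in for the i-vectors, and all file paths and labels are hypothetical.

```python
# Minimal sketch, not the authors' code: summary MFCC statistics stand in for
# i-vectors, and the file paths/labels below are hypothetical placeholders.
import numpy as np
import librosa
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def mfcc_summary(path, n_mfcc=20):
    """Summarize one recording's MFCCs (a stand-in for an i-vector)."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical recordings and labels (1 = depression, 0 = healthy control).
wav_paths = ["audio/depr_001.wav", "audio/ctrl_001.wav"]  # replace with real data
labels = [1, 0]

X = np.vstack([mfcc_summary(p) for p in wav_paths])
y = np.asarray(labels)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Logistic regression with ridge (L2) regularization; the regularization
# strength C is tuned by 5-fold cross-validation on the training set only.
model = GridSearchCV(
    make_pipeline(StandardScaler(),
                  LogisticRegression(penalty="l2", max_iter=1000)),
    param_grid={"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    cv=5, scoring="roc_auc")
model.fit(X_train, y_train)

# Apply the tuned model to the held-out test set and report AUC.
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Test AUC: {test_auc:.2f}")
```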
The area under the curve (AUC) for classifying depression versus bipolar disorder was 0.5 (- = 0.44). For the other comparisons, AUC scores ranged from 0.75 to 0.92 and the - ranged from 0.73 to 0.91. The model performance (AUC) for classifying depression versus bipolar disorder was significantly worse than that for classifying bipolar disorder versus schizophrenia (corrected p < 0.05); there were no significant differences in the remaining pairwise comparisons of the seven classification tasks.
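The abstract does not state which statistical test was used to compare AUCs across tasks; one common choice is a bootstrap test on the AUC difference with a multiple-comparison correction, sketched below purely for illustration (function and variable names are hypothetical).

```python
# Illustrative only: a bootstrap test for the difference between the AUCs of
# two classification tasks, with a Bonferroni-style correction for the 21
# pairwise comparisons among 7 tasks. Not necessarily the authors' procedure.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def auc_diff_pvalue(y_a, s_a, y_b, s_b, n_boot=2000):
    """Two-sided bootstrap p-value for AUC(task A) - AUC(task B).

    y_* are binary test-set label arrays, s_* the model scores for each task.
    """
    observed = roc_auc_score(y_a, s_a) - roc_auc_score(y_b, s_b)
    diffs = []
    for _ in range(n_boot):
        ia = rng.integers(0, len(y_a), len(y_a))
        ib = rng.integers(0, len(y_b), len(y_b))
        if y_a[ia].min() == y_a[ia].max() or y_b[ib].min() == y_b[ib].max():
            continue  # a resample missed one class; AUC is undefined
        diffs.append(roc_auc_score(y_a[ia], s_a[ia]) -
                     roc_auc_score(y_b[ib], s_b[ib]))
    diffs = np.asarray(diffs)
    # Center the bootstrap distribution to approximate the null of no difference.
    p = np.mean(np.abs(diffs - diffs.mean()) >= abs(observed))
    return observed, min(1.0, p * 21)  # Bonferroni correction for 21 comparisons
```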
Vocal features showed discriminatory potential for classifying depression versus healthy controls, as well as depression versus other mental disorders. Future research should systematically examine the mechanisms by which voice features distinguish depression from other mental disorders and develop more sophisticated machine learning models so that voice can better assist clinical diagnosis.