Luo Qian, Di Yazheng, Zhu Tingshao
Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China.
J Affect Disord. 2024 May 1;352:395-402. doi: 10.1016/j.jad.2024.02.021. Epub 2024 Feb 9.
Neuroticism affects both psychopathology and physical health, which gives it significant public health implications, and multiple studies confirm that it predicts suicide risk among depressed patients. However, previous research has lacked a standardized criterion for assessing neuroticism from speech and has often relied on simple features such as pitch, loudness, and Mel-frequency cepstral coefficients (MFCCs). This study aims to improve on that work by extracting features with pre-trained speaker embedding models (i-vector and x-vector extractors). In addition, unlike prior studies that used general-population data, we examine neuroticism prediction separately in depressed and non-depressed subgroups.
We collected edited discourse data from clinical interviews of 3580 depressed individuals and 4016 healthy individuals in the CONVERGE study. Rather than extracting only low-level acoustic descriptors, we also incorporated i-vector and x-vector features. We compared how well the three feature sets predicted neuroticism and tested whether combining them improved model accuracy.
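As a rough illustration of what combining the three feature sets can look like, the following Python sketch concatenates them per speaker and fits a support vector regressor after reducing the fused vector to 300 dimensions. Everything specific here is an assumption made for illustration, not the study's procedure: the data are random placeholders, the feature dimensions (88, 400, 512) are hypothetical, and PCA via scikit-learn is only one possible way to perform the reduction.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_speakers = 500  # hypothetical sample size, far smaller than the study's

# Random stand-ins for the three feature sets; all dimensions are assumptions.
lld = rng.normal(size=(n_speakers, 88))     # low-level acoustic descriptors
ivec = rng.normal(size=(n_speakers, 400))   # i-vector embeddings
xvec = rng.normal(size=(n_speakers, 512))   # x-vector embeddings
scores = rng.normal(size=n_speakers)        # placeholder neuroticism scores

# Feature-level fusion: one concatenated vector per speaker.
X = np.hstack([lld, ivec, xvec])

X_train, X_test, y_train, y_test = train_test_split(
    X, scores, test_size=0.2, random_state=0
)

# Standardize, reduce to 300 dimensions (PCA is an assumed choice), then fit SVR.
model = make_pipeline(StandardScaler(), PCA(n_components=300), SVR(kernel="rbf"))
model.fit(X_train, y_train)
pred = model.predict(X_test)
```

Because the inputs are random, the predictions carry no signal; the sketch only shows the shape of the fusion and reduction steps. Keeping the scaler and PCA inside the pipeline means both are fitted on the training split alone.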
The support vector regression (SVR) model that combined all three speech feature sets, with the combined features reduced to 300 dimensions, performed best in predicting neuroticism scores. It achieved a coefficient of determination (R²) of 0.3 or higher and a correlation of 0.56 between predicted and actual values. When speech features were used to classify neuroticism within specific populations (healthy and depressed), accuracy exceeded 60%.
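The two regression metrics reported above measure different things, and a minimal sketch may help make the distinction concrete. The pure-Python functions below use the standard textbook definitions of the coefficient of determination and the Pearson correlation; the function names and the toy numbers are illustrative assumptions, not values from the study.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def pearson_r(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Toy data: predictions that are perfectly correlated but badly scaled.
actual = [1.0, 2.0, 3.0, 4.0]
doubled = [2.0, 4.0, 6.0, 8.0]

r2_doubled = r_squared(actual, doubled)   # penalizes the scale error
r_doubled = pearson_r(actual, doubled)    # ignores the scale error
```

Correlation is insensitive to the scale and offset of the predictions, whereas R² penalizes both, so the two are usually reported together. For an ordinary least-squares fit evaluated on its own training data, R² equals the squared correlation; that identity need not hold for SVR predictions, so the reported pair (R² of 0.3 or higher, correlation of 0.56) should be read as two separate measurements.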
This study included only women.
Combining diverse speech features improves the ability of speech-based models to predict neuroticism, particularly within specific populations. This study lays a foundation for further work on speech features in neuroticism prediction.