Li Jinyu, Wang Yang, Wang Fei, Zhang Ran, Wang Ning, Zhu Yue, Zhao Taihong
The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China.
Nanjing Medical University, Nanjing, China.
Depress Anxiety. 2025 Jun 16;2025:5734107. doi: 10.1155/da/5734107. eCollection 2025.
Current assessments of adolescent emotional and behavioral problems rely heavily on subjective reports, which are prone to bias. This study is the first to explore speech signals as objective markers for predicting emotional and behavioral problems (hyperactivity, emotional symptoms, conduct problems, and peer problems) in adolescents using machine learning. We analyzed speech data from 8215 adolescents aged 12-18 years and extracted four categories of speech features: mel-frequency cepstral coefficients (MFCC), mel energy spectrum (MELS), prosodic features (PROS), and formant features (FORM). Three machine learning models, logistic regression (LR), support vector machine (SVM), and gradient boosting decision trees (GBDT), were used to classify hyperactivity, emotional symptoms, conduct problems, and peer problems as defined by the Strengths and Difficulties Questionnaire (SDQ). Model performance was assessed using area under the curve (AUC) and F1-score, and Shapley additive explanations (SHAP) values were used to examine feature importance. The GBDT model performed best for predicting hyperactivity (AUC = 0.78) and emotional symptoms (AUC = 0.74 for males and 0.66 for females), while performance was weaker for conduct and peer problems. SHAP analysis revealed gender-specific feature importance patterns, with certain speech features being more important for males than for females. These findings support the feasibility of using speech features to objectively predict emotional and behavioral problems in adolescents and to identify gender-specific markers. This study lays the foundation for developing speech-based assessment tools for early identification and intervention, offering an objective alternative to traditional subjective evaluation methods.
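For readers unfamiliar with the two performance metrics named in the abstract, the following is a minimal sketch of how AUC and F1-score can be computed for a binary classifier. It is not the authors' code. The labels and scores are invented toy values rather than study data, and the 0.5 decision threshold is an assumption made only for illustration.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical toy data: 1 = problem present, 0 = absent (not from the study).
y_true = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.7, 0.8, 0.2]          # model-predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

auc_value = roc_auc(y_true, scores)
f1_value = f1_score(y_true, y_pred)
```

AUC depends only on how the scores rank the cases, whereas F1 depends on the chosen threshold, which is one reason the two metrics are typically reported together.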