Chen Hongxu, Wang Denglan, Shen Juanjuan, Guo Baoyan, Song Chun, Ma Duo, Wu Yan, Liu Guohui, Chen Guangxue, Ni Yan, Kong Tiantian, Wang Fan
School of Public Health, Xinjiang Medical University, Urumqi, 830063, China.
Xinjiang Key Laboratory of Neurological Disorder Research, the Second Affiliated Hospital of Xinjiang Medical University, Urumqi, 830063, China.
BMC Pregnancy Childbirth. 2025 May 8;25(1):544. doi: 10.1186/s12884-025-07656-3.
Traditional statistical methods have dominated research on peripartum depression (PPD), but newer analytical approaches may provide deeper insights. This study aims to identify factors that influence PPD using elastic net regression (ENR) combined with machine learning (ML) models.
This longitudinal study was conducted from June 2020 to May 2023. Healthy pregnant women were enrolled in the first trimester and followed until they completed the second-trimester assessment. PPD symptoms were assessed using the Edinburgh Postnatal Depression Scale (EPDS). Features with p < 0.05 in logistic regression were selected and then refined using ENR. The retained features were used to build six ML models, which were compared to identify the best-performing one. SHapley Additive exPlanations (SHAP) analysis was used to improve model interpretability by visualizing how each feature contributed to the model's predictions.
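To illustrate the idea behind the SHAP step, the sketch below computes exact Shapley values for a single prediction by enumerating every feature coalition. It is a minimal illustration, not the study's implementation: the function `toy_score`, its coefficients, and the feature values are hypothetical, and the study itself would have relied on a SHAP implementation applied to the fitted model rather than on brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance `x` relative to `baseline`.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                with_t = {f: x[f] if (f in coalition or f == target) else baseline[f]
                          for f in names}
                without_t = {f: x[f] if f in coalition else baseline[f]
                             for f in names}
                total += weight * (predict(with_t) - predict(without_t))
        phi[target] = total
    return phi


def toy_score(f):
    # Hypothetical risk score; coefficients are illustrative, not from the study.
    return 0.5 - 0.3 * f["Mg"] + 0.2 * f["RC"] - 0.1 * f["Mg"] * f["Ca"]


x = {"Mg": 1.0, "RC": 1.0, "Ca": 1.0}          # hypothetical standardized values
baseline = {"Mg": 0.0, "RC": 0.0, "Ca": 0.0}   # hypothetical reference point
phi = shapley_values(toy_score, x, baseline)

for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the prediction for `x` and the prediction at the baseline, which is the property that lets SHAP plots rank features by their contribution. A negative value here means the feature lowers the toy score, and a positive value means it raises it.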
A total of 608 participants were followed, yielding 384 valid questionnaires. After excluding participants with incomplete or incorrect baseline data, 325 participants were included in the analysis. Among these, 130 were classified as having mild depression and 32 as having major depression. Nineteen features were initially identified as being associated with PPD, of which 14 were retained after ENR refinement. The random forest (RF) model outperformed the other ML models. SHAP analysis identified the top five predictors of PPD: magnesium (Mg), remnant cholesterol (RC), calcium (Ca), mean corpuscular hemoglobin concentration (MCHc), and potassium (K). Mg, Ca, MCHc, and K were negatively correlated with PPD, while RC showed a positive correlation.
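The participant counts reported above imply a few proportions that the abstract does not state explicitly. The short sketch below derives them from the reported numbers only; the variable names are illustrative, and the assumption that the 130 and 32 participants are non-overlapping groups within the 325 is an inference from the wording.

```python
# Counts taken directly from the reported results.
followed = 608
valid_questionnaires = 384
included = 325
mild = 130
major = 32

# Derived quantities (assumes mild and major groups do not overlap).
with_symptoms = mild + major
without_symptoms = included - with_symptoms

valid_rate = round(100 * valid_questionnaires / followed, 1)
symptom_rate = round(100 * with_symptoms / included, 1)

print(f"Valid questionnaires: {valid_rate}% of those followed")
print(f"With depressive symptoms: {with_symptoms} of {included} ({symptom_rate}%)")
print(f"Without depressive symptoms: {without_symptoms}")
```

Under that assumption, roughly half of the analyzed sample was classified as having some degree of depressive symptoms, so the outcome classes used for model training would have been close to balanced.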
The RF model effectively identified associations between exposure factors and PPD. Mg, Ca, MCHc, and K were found to be protective factors, while RC emerged as a potential risk factor, which highlights the potential of RC as a novel biomarker for PPD.