Hua Qing, Yang Fengchun, Zhou Yadan, Shi Fenglian, You Xiaoyan, Guo Jing, Li Li
Department of Obstetrics and Gynecology, Zhengzhou Central Hospital Affiliated to Zhengzhou University, Zhengzhou, China.
Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China.
J Med Internet Res. 2025 May 27;27:e70068. doi: 10.2196/70068.
Fetal growth restriction (FGR) is a common complication of preeclampsia. FGR in patients with preeclampsia increases the risk of neonatal-perinatal mortality and morbidity. However, previous prediction methods for FGR are class-biased or clinically unexplainable, which makes them difficult to apply in clinical practice and leads to delayed intervention and a lack of effective treatments.
The study aims to develop an auxiliary diagnostic model based on machine learning (ML) to predict the occurrence of FGR in patients with preeclampsia.
This study used a retrospective case-control design to analyze 38 features, including the basic medical history and peripheral blood laboratory test results, of pregnant patients with preeclampsia with or without FGR. ML models were constructed to evaluate the predictive value of changes in maternal parameters for preeclampsia complicated by FGR. Multiple algorithms were tested, including logistic regression, light gradient boosting, random forest (RF), extreme gradient boosting, multilayer perceptron, naive Bayes, and support vector machine. Model performance was assessed using the area under the curve (AUC) and other evaluation indexes. The Shapley additive explanations (SHAP) method was used to rank feature importance and to explain the final model for clinical application.
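As a rough illustration of the evaluation step described above, the sketch below computes the AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, then summarizes per-fold AUCs as a mean and SD. It is only a minimal sketch: the function names (`auc`, `summarize_folds`) and the toy labels and scores are hypothetical, and the use of the sample SD is an assumption, since the abstract does not say how the SD was computed.

```python
from statistics import mean, stdev


def auc(labels, scores):
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case has the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_results):
    """Mean and sample SD of per-fold AUCs.

    fold_results is a list of (labels, scores) pairs, one per held-out fold.
    """
    aucs = [auc(labels, scores) for labels, scores in fold_results]
    return mean(aucs), stdev(aucs)


# Hypothetical toy folds (1 = FGR, 0 = no FGR); not data from the study.
toy_folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]),
    ([1, 0, 0, 1], [0.7, 0.8, 0.1, 0.6]),
]
mean_auc, sd_auc = summarize_folds(toy_folds)
print(f"AUC mean={mean_auc:.2f}, SD={sd_auc:.2f}")
```

The pairwise definition is quadratic in the number of cases, which is fine for an illustration; a rank-based computation would be the usual choice for larger datasets.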
The RF model showed the best discriminative ability among the 7 ML models. After features were reduced according to their importance rank, an explainable final RF model was established with 9 features: urinary protein quantification, gestational week of delivery, umbilical artery systolic-to-diastolic ratio, amniotic fluid index, triglyceride, D-dimer, weight, height, and maximum systolic pressure. Using 5-fold cross-validation on the training and testing dataset of 513 patients with preeclampsia (149 with FGR and 364 without FGR), the model achieved an AUC of 0.83 (SD 0.03). Performance was similar in an external dataset of 103 patients with preeclampsia (45 with FGR and 58 without FGR; AUC 0.82, SD 0.048). Overall, urinary protein quantification, umbilical artery systolic-to-diastolic ratio, and gestational week of delivery exhibited the highest contributions to the model performance (c=0.45, 0.34, and 0.33) based on SHAP analysis. For individual patients, the SHAP results reveal the protective and risk factors for developing FGR, which helps to interpret the model's clinical significance. Finally, the model has been translated into a convenient web page tool to facilitate its use in clinical settings.
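To make the two uses of SHAP mentioned above more concrete (a global ranking and a per-patient explanation), the sketch below computes exact Shapley values for a toy linear risk score and then ranks features by mean absolute Shapley value. Everything in it is illustrative: `toy_model`, its weights, the baseline, and the patient vectors are hypothetical, and reading the reported contributions as mean absolute SHAP values is an assumption, since the abstract does not define them. Exact enumeration over feature subsets is only feasible for a handful of features; it is used here for clarity, not because the study used it.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the coalition that feature i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def global_importance(shap_rows):
    """Mean absolute Shapley value per feature across instances."""
    n = len(shap_rows[0])
    return [sum(abs(row[i]) for row in shap_rows) / len(shap_rows) for i in range(n)]


def toy_model(z):
    # Hypothetical linear risk score over three made-up features.
    return 2.0 * z[0] + 1.0 * z[1] - 0.5 * z[2]


baseline = [0.0, 0.0, 0.0]                      # hypothetical reference patient
patients = [[1.0, 2.0, 1.0], [0.5, 0.0, 3.0]]   # hypothetical feature vectors

rows = [shapley_values(toy_model, p, baseline) for p in patients]
importance = global_importance(rows)
print(rows)        # per-patient contributions: sign shows risk vs protective direction
print(importance)  # global ranking by mean absolute contribution
```

In each row, a positive value pushes the toy score up (a risk direction) and a negative value pushes it down (a protective direction), and the values for one patient sum to the difference between that patient's score and the baseline score.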
The study successfully developed a model that accurately predicts FGR development in patients with preeclampsia. The SHAP method captures highly relevant risk factors for model interpretation, alleviating concerns about the "black box" problem of ML techniques.