Kawakita Tetsuya, Martins Juliana G, Diab Yara H, Nehme Lea, Saade George
Department of Obstetrics and Gynecology, Macon and Joan Brock Virginia Health Sciences at Old Dominion University (ODU), Norfolk, Virginia.
Am J Perinatol. 2024 Dec 24. doi: 10.1055/a-2495-3600.
This study aimed to develop machine learning (ML) models for predicting preterm preeclampsia using information available before 23 weeks' gestation.
This was a secondary analysis of the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (nuMoM2b) cohort. We considered 131 features available before 23 weeks, including maternal demographics, obstetric and family history, social determinants of health, physical activity, nutrition, and early second-trimester ultrasound. Our primary outcome was preterm preeclampsia before 37 weeks. The dataset was randomly split into a training set (70%) and a validation set (30%). ML models were developed using glmnet, multilayer perceptron, random forest, XGBoost (extreme gradient boosting), and LightGBM. The final model was developed using the ML approach that achieved the best area under the curve (AUC). Further feature selection was conducted from the top 25 features ranked by SHapley Additive exPlanations (SHAP) values. The performance of the final model was assessed using the validation dataset.
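The random 70%/30% partition described above can be illustrated with a minimal sketch. It uses only the Python standard library; the function name `train_validation_split`, the seed, and the use of row indices in place of actual nuMoM2b records are assumptions made for illustration, not the authors' code.

```python
import random


def train_validation_split(n_rows, train_fraction=0.7, seed=0):
    """Randomly partition row indices into training and validation sets.

    Hypothetical helper for illustration only; the seed is arbitrary.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # random, non-stratified shuffle
    cut = round(n_rows * train_fraction)  # number of rows in the training set
    return indices[:cut], indices[cut:]


# 9,467 is the cohort size reported in the abstract.
train_idx, valid_idx = train_validation_split(9467)
print(len(train_idx), len(valid_idx))
```

With an outcome prevalence near 2%, a purely random split can leave slightly different event rates in the two sets; whether the authors stratified the split is not stated.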
Of 9,467 individuals, 219 (2.3%) had preterm preeclampsia. The XGBoost model had the highest AUC among the candidate models (AUC = 0.749; 95% confidence interval [CI], 0.736-0.762) and was therefore used to develop models with fewer variables. The XGBoost model with eight features (in order of importance: mean uterine artery pulsatility index in the early second trimester, chronic hypertension, pregestational diabetes, uterine artery notch, systolic and diastolic blood pressure in the first trimester, body mass index, and maternal age) was chosen as the final model because its AUC of 0.741 (95% CI, 0.730-0.752) was not inferior to that of the original model (p = 0.58). In the validation dataset, the final model had an AUC of 0.779 (95% CI, 0.722-0.831). An online application of the final model was developed (https://kawakita.shinyapps.io/Preterm_preeclampsia/).
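To make the reported metric concrete, the following sketch computes an AUC from ranks and attaches a percentile-bootstrap 95% CI. The abstract does not say how its confidence intervals were obtained, so the bootstrap here is an assumption, as are the function names and the synthetic labels and scores.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a random case outscores a random non-case."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores so they share an average rank.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the AUC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        if 0 < sum(ys) < n:  # a resample needs both classes for AUC to exist
            stats.append(auc(ys, [scores[i] for i in sample]))
    stats.sort()
    return stats[int(n_boot * alpha / 2)], stats[int(n_boot * (1 - alpha / 2)) - 1]


# Synthetic data only: cases score higher on average than non-cases.
rng = random.Random(1)
labels = [1 if rng.random() < 0.1 else 0 for _ in range(400)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
lower, upper = bootstrap_auc_ci(labels, scores, n_boot=200)
print(round(auc(labels, scores), 3), round(lower, 3), round(upper, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases.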
ML algorithms using information available before 23 weeks can accurately predict preterm preeclampsia before 37 weeks.
· Prediction models using uterine artery Doppler have not been adopted in the US.
· We developed a model using an ML algorithm.
· An online application of the final model was developed.
· ML algorithms using information available before 23 weeks can accurately predict preterm preeclampsia before 37 weeks.