Eberhard Braden W, Cohen Raphael Y, Rigoni John, Bates David W, Gray Kathryn J, Kovacheva Vesela P
medRxiv. 2023 Aug 16:2023.08.16.23293946. doi: 10.1101/2023.08.16.23293946.
Preeclampsia is a pregnancy-specific disease characterized by new-onset hypertension after 20 weeks of gestation. It affects 2-8% of all pregnancies and contributes to up to 26% of maternal deaths. Despite extensive clinical research, current predictive tools fail to identify up to 66% of patients who will develop preeclampsia. We sought to develop a tool to longitudinally predict preeclampsia risk.
In this retrospective model development and validation study, we examined a large cohort of patients who delivered at six community and two tertiary care hospitals in the New England region between February 2015 and June 2023. We used sociodemographic, clinical diagnosis, family history, laboratory, and vital sign data. We built eight datasets: one each at 14, 20, 24, 28, 32, 36, and 39 weeks of gestation, and one at hospital admission for delivery. We trained linear regression, random forest, XGBoost, and deep neural network models and compared their performance. We used Shapley values to investigate the global and local explainability of the models and the relationships between the predictive variables.
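The eight time-point datasets described above can be pictured as snapshots that each use only data recorded up to a given gestational age. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the `Observation` record, the feature name `systolic_bp`, the "latest value at or before the cutoff" rule, and the use of `None` for the delivery-admission snapshot are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Gestational-age cutoffs (weeks) listed in the abstract. The delivery-admission
# snapshot is represented by None, meaning "no cutoff" (an assumption).
CUTOFFS_WEEKS = [14, 20, 24, 28, 32, 36, 39, None]


@dataclass
class Observation:
    """One recorded value for one patient (hypothetical record layout)."""
    patient_id: str
    gest_week: float  # gestational age when the value was recorded
    name: str         # hypothetical feature name, e.g. "systolic_bp"
    value: float


def build_snapshot(observations, cutoff_week: Optional[float]):
    """Return {patient_id: {feature: latest value recorded at or before cutoff}}.

    Values recorded after the cutoff are ignored, so a model trained on this
    snapshot never sees information from later in the pregnancy.
    """
    snapshot = {}
    # Process in chronological order so later values overwrite earlier ones.
    for obs in sorted(observations, key=lambda o: o.gest_week):
        if cutoff_week is not None and obs.gest_week > cutoff_week:
            continue
        snapshot.setdefault(obs.patient_id, {})[obs.name] = obs.value
    return snapshot


def build_all_snapshots(observations):
    """Build one snapshot per cutoff, keyed by the cutoff itself."""
    return {cutoff: build_snapshot(observations, cutoff) for cutoff in CUTOFFS_WEEKS}
```

Keying each snapshot by its cutoff keeps the time points separate, which is one simple way to guard against the data leakage concern raised about earlier studies; a real pipeline would also need to handle missing values and feature aggregation.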
Our study population (N=120,752) had an incidence of preeclampsia of 5.7% (N=6,920). Model performance, measured by the area under the curve (AUC), ranged from 0.73 to 0.91 and was externally validated. The relationships between some of the variables were complex and non-linear; in addition, the relative significance of the predictors varied over the pregnancy. Compared with the current standard of care for first-trimester preeclampsia risk stratification, our model would identify 48.6% more at-risk patients.
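For readers less familiar with the AUC metric reported above, the sketch below computes it directly from its pairwise definition, assuming the AUC refers to the receiver operating characteristic curve. The function name `roc_auc` and the toy labels and scores are illustrative only and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via its pairwise (Mann-Whitney) definition.

    The AUC equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case, with
    ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = developed preeclampsia, 0 = did not (made-up values).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

The double loop is quadratic in cohort size, so it is suitable only for illustration; for a cohort of this scale a rank-based implementation would be used instead.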
Our novel preeclampsia prediction tool would allow clinicians to identify patients at risk early and provide personalized predictions, as well as longitudinal predictions throughout pregnancy.
National Institutes of Health, Anesthesia Patient Safety Foundation.
Current tools for predicting preeclampsia are inadequate: they fail to identify up to 66% of the patients who develop the disease. We searched PubMed, MEDLINE, and the Web of Science from database inception to May 1, 2023, using the keywords "deep learning", "machine learning", "preeclampsia", "artificial intelligence", "pregnancy complications", and "predictive models". We identified 13 studies that used machine learning to develop prediction models for preeclampsia risk based on clinical variables. Of these, six included biomarkers such as serum placental growth factor, pregnancy-associated plasma protein A, and uterine artery pulsatility index, which are not routinely available in our clinical practice; two were in diverse cohorts of more than 100,000 patients; and two developed longitudinal predictions using medical records data. However, most studies had limited depth or raised concerns about data leakage, overfitting, or lack of generalizability.
We developed a comprehensive longitudinal predictive tool based on routine clinical data that can be used throughout pregnancy to predict the risk of preeclampsia. We tested multiple types of predictive models, including machine learning and deep learning models, and demonstrated high predictive power. We investigated how individual and group variables changed across time points and found previously known and novel relationships between preeclampsia risk and variables such as red blood cell count.
Longitudinal prediction of preeclampsia using machine learning can be achieved with high performance. Implementing an accurate predictive tool within the electronic health record can aid clinical care and identify patients at heightened risk who would benefit from aspirin prophylaxis, increased surveillance, early diagnosis, and escalation in care. These results highlight the potential of artificial intelligence in clinical decision support, with the ultimate goal of reducing iatrogenic preterm birth and improving perinatal care.