Departments of Obstetrics and Gynecology, University of North Carolina at Chapel Hill, Chapel Hill; Duke University, Durham; and Wake Forest University, Winston-Salem, North Carolina.
Obstet Gynecol. 2020 Apr;135(4):935-944. doi: 10.1097/AOG.0000000000003759.
To predict a woman's risk of postpartum hemorrhage at labor admission using machine learning and statistical models.
Predictive models were constructed and compared using data from the 10 of 12 sites in the U.S. Consortium for Safe Labor Study (2002-2008) that consistently reported estimated blood loss at delivery. The outcome was postpartum hemorrhage, defined as an estimated blood loss of at least 1,000 mL. Fifty-five candidate risk factors routinely available on labor admission were considered. We used logistic regression with and without lasso regularization (lasso regression) as the two statistical models, and random forest and extreme gradient boosting as the two machine learning models to predict postpartum hemorrhage. Model performance was measured by C statistics (ie, concordance index), calibration, and decision curves. Models were constructed from the first phase (2002-2006) and externally validated (ie, temporally) in the second phase (2007-2008). Further validation was performed combining both temporal and site-specific validation.
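The two evaluation ideas named above, the C statistic and temporal validation, can be illustrated with a minimal sketch. This is not the study's code: the function names, the `year` field, and the toy inputs are assumptions made for illustration, and the pairwise definition used here is the standard concordance index for a binary outcome.

```python
from itertools import product


def c_statistic(y_true, y_score):
    """Concordance index for a binary outcome.

    Fraction of (event, non-event) pairs in which the event received
    the higher predicted risk; tied predictions count as half.
    """
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0
        elif e == n:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


def temporal_split(records, last_training_year=2006):
    """Split records into a development set (2002-2006 by default)
    and a later validation set, mirroring the two study phases.

    `records` is assumed to be a list of dicts with a hypothetical
    integer `year` key.
    """
    train = [r for r in records if r["year"] <= last_training_year]
    valid = [r for r in records if r["year"] > last_training_year]
    return train, valid
```

For example, with outcomes `[0, 0, 1, 1]` and predicted risks `[0.1, 0.4, 0.35, 0.8]`, three of the four event/non-event pairs are ranked correctly, so `c_statistic` returns 0.75. The pairwise loop is quadratic in the number of births and is meant only to make the definition explicit; a rank-based computation would be preferable at the scale of this cohort.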
Of the 152,279 assessed births, 7,279 (4.8%, 95% CI 4.7-4.9) had postpartum hemorrhage. All models had good-to-excellent discrimination. The extreme gradient boosting model had the best discriminative ability to predict postpartum hemorrhage (C statistic: 0.93; 95% CI 0.92-0.93), followed by random forest (C statistic: 0.92; 95% CI 0.91-0.92). The lasso regression model (C statistic: 0.87; 95% CI 0.86-0.88) and logistic regression (C statistic: 0.87; 95% CI 0.86-0.87) had lower-but-good discriminative ability. The above results held with validation across both time and sites. Decision curve analysis demonstrated that, although all models provided superior net benefit when clinical decision thresholds were between 0% and 80% predicted risk, the extreme gradient boosting model provided the greatest net benefit.
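The abstract does not give the net benefit formula. As a minimal sketch, assuming the standard decision curve definition (true positives per patient minus false positives per patient weighted by the odds of the threshold), net benefit at a given risk threshold could be computed as follows. The function names and toy data are illustrative assumptions, not the authors' implementation.

```python
def net_benefit(y_true, y_risk, threshold):
    """Net benefit of intervening on everyone whose predicted risk
    is at or above `threshold` (a probability in [0, 1))."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold must be in [0, 1)")
    n = len(y_true)
    # Count flagged patients who did (tp) and did not (fp) have the outcome.
    tp = sum(1 for y, r in zip(y_true, y_risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, y_risk) if r >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: intervene on every patient."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold must be in [0, 1)")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

A decision curve plots these quantities across a range of thresholds (here, 0% to 80% predicted risk), alongside the "treat none" strategy, whose net benefit is zero by definition. A model is useful at a threshold when its curve lies above both reference strategies.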
Postpartum hemorrhage can be predicted at labor admission with excellent discriminative ability using machine learning and statistical models. Further clinical application is needed, which may help health care providers prepare for and triage at-risk women.