Mboya Innocent B, Mahande Michael J, Mohammed Mohanad, Obure Joseph, Mwambi Henry G
School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, KwaZulu-Natal, South Africa
Department of Epidemiology and Biostatistics, Institute of Public Health, Kilimanjaro Christian Medical University College, Moshi, Tanzania.
BMJ Open. 2020 Oct 19;10(10):e040132. doi: 10.1136/bmjopen-2020-040132.
We aimed to determine the key predictors of perinatal deaths using machine learning models compared with the logistic regression model.
A secondary analysis of data from the Kilimanjaro Christian Medical Centre (KCMC) Medical Birth Registry cohort, 2000 to 2015. We assessed the discriminative ability of the models using the area under the receiver operating characteristic curve (AUC) and their net benefit using decision curve analysis.
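As an illustration of the discrimination measure named above, the AUC can be read as the probability that a randomly chosen delivery ending in perinatal death receives a higher predicted risk than a randomly chosen delivery that does not. The sketch below computes it directly from that pairwise definition. The function name `auc` and the toy labels and scores are hypothetical, chosen for illustration only; they are not the study's data or code.

```python
def auc(y_true, y_score):
    """AUC from its pairwise definition: the share of (event, non-event)
    pairs in which the event has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = perinatal death, 0 = survived to discharge.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
y_prob = [0.90, 0.20, 0.60, 0.70, 0.10, 0.30, 0.05, 0.40, 0.35, 0.15]

print(round(auc(y_true, y_prob), 3))
```

The pairwise loop is quadratic in sample size, so for a cohort of this size a rank-based implementation from a statistics library would be the practical choice; the version here is meant only to make the definition concrete.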
The KCMC is a zonal referral hospital located in Moshi Municipality, Kilimanjaro region, Northern Tanzania. The Medical Birth Registry is within the hospital grounds at the Reproductive and Child Health Centre.
Singleton deliveries (n=42 319) with complete records from 2000 to 2015.
Perinatal death (a composite of stillbirths and early neonatal deaths). These outcomes were captured only up to the time mothers were discharged from the hospital.
The proportion of perinatal deaths was 3.7%. There were no statistically significant differences in the predictive performance of four machine learning models except for bagging, which had a significantly lower performance (AUC 0.76, 95% CI 0.74 to 0.79, p=0.006) compared with the logistic regression model (AUC 0.78, 95% CI 0.76 to 0.81). However, in the decision curve analysis, the machine learning models had a higher net benefit (ie, the correct classification of perinatal deaths considering a trade-off between false-negatives and false-positives) than the logistic regression model across a range of threshold probability values.
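To make the net benefit comparison concrete, the sketch below uses the standard decision curve analysis formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability and n is the number of deliveries. The function names and toy data are hypothetical and are not taken from the study; the "treat all" reference strategy is included because decision curves are usually read against it.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging every delivery with predicted risk >= threshold.

    True positives are credited in full; false positives are penalised by
    the odds of the threshold, threshold / (1 - threshold).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: flag every delivery regardless of predicted risk."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (hypothetical): 1 = perinatal death, 0 = survived to discharge.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
y_prob = [0.90, 0.20, 0.60, 0.70, 0.10, 0.30, 0.05, 0.40, 0.35, 0.15]

# A decision curve plots these values over a range of thresholds.
for pt in (0.1, 0.3, 0.5):
    model = net_benefit(y_true, y_prob, pt)
    treat_all = net_benefit_treat_all(y_true, pt)
    print(f"pt={pt:.1f}  model={model:.3f}  treat_all={treat_all:.3f}")
```

Under this formula, one model has a higher net benefit than another at a given threshold when it yields more true positives per delivery after the weighted false positives are subtracted, which is the sense in which the comparison across threshold probabilities is made.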
In this cohort, there was no significant difference in the prediction of perinatal deaths between machine learning and logistic regression models, except for bagging. The machine learning models had a higher net benefit, as their predictive ability for perinatal death was considerably superior to that of the logistic regression model. The machine learning models, as demonstrated by our study, can be used to improve the prediction of perinatal deaths and the triage of women at risk.