AlSaad Rawan, Malluhi Qutaibah, Boughorbel Sabri
College of Engineering, Qatar University, Doha, Qatar.
Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar.
BioData Min. 2022 Feb 14;15(1):6. doi: 10.1186/s13040-022-00289-8.
Early identification of pregnant women at risk for preterm birth (PTB), a major cause of infant mortality and morbidity, has significant potential to improve prenatal care. However, effective predictive models that accurately forecast PTB and pair their predictions with interpretations useful to clinicians are lacking. In this work, we introduce a clinical prediction model (PredictPTB) that combines variables (medical codes) readily accessible through electronic health records (EHRs) to accurately predict the risk of preterm birth at 1, 3, 6, and 9 months prior to delivery.
The architecture of PredictPTB employs recurrent neural networks (RNNs) to model a patient's longitudinal EHR visits and uses a single code-level attention mechanism to improve predictive performance while providing temporal code-level and visit-level explanations for its predictions. We compare performance across different combinations of prediction time points, data modalities, and data windows. We also present a case study of the model's interpretability, illustrating how clinicians can gain transparency into the predictions.
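To make the idea of code-level attention over an RNN more concrete, the following is a minimal sketch in Python with NumPy. It is not the PredictPTB implementation: the function name `predict_with_attention`, the plain tanh recurrence, the embedding size, and the randomly initialised weights are all assumptions chosen for illustration. The sketch flattens a patient's visits into one sequence of code ids, runs a simple recurrent layer over it, assigns one attention weight per code, and sums those weights within each visit to obtain a visit-level weight.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def predict_with_attention(visits, vocab_size, dim=8, seed=0):
    """Hypothetical sketch: visits is a list of visits, each a list of integer code ids."""
    rng = np.random.default_rng(seed)
    # Randomly initialised parameters stand in for trained weights (assumption).
    E = rng.normal(scale=0.1, size=(vocab_size, dim))  # code embeddings
    W_x = rng.normal(scale=0.1, size=(dim, dim))       # input-to-hidden
    W_h = rng.normal(scale=0.1, size=(dim, dim))       # hidden-to-hidden
    w_att = rng.normal(scale=0.1, size=dim)            # attention scoring vector
    w_out = rng.normal(scale=0.1, size=dim)            # output layer

    # Flatten visits into one code sequence, remembering each code's visit.
    codes = [c for visit in visits for c in visit]
    visit_index = [i for i, visit in enumerate(visits) for _ in visit]

    # Simple tanh recurrence over the code sequence.
    h = np.zeros(dim)
    states = []
    for c in codes:
        h = np.tanh(E[c] @ W_x + h @ W_h)
        states.append(h)
    H = np.stack(states)                               # shape (n_codes, dim)

    alpha = softmax(H @ w_att)                         # one attention weight per code
    context = alpha @ H                                # attention-weighted summary
    risk = 1.0 / (1.0 + np.exp(-(context @ w_out)))    # sigmoid risk score

    # Visit-level weight = sum of the code-level weights inside that visit.
    visit_weights = np.zeros(len(visits))
    np.add.at(visit_weights, visit_index, alpha)
    return risk, alpha, visit_weights
```

Calling `predict_with_attention([[3, 7], [1], [4, 7, 2]], vocab_size=10)` returns a risk score between 0 and 1, six code-level weights that sum to 1, and three visit-level weights that also sum to 1. With untrained weights the numbers carry no clinical meaning; the point is only how a single set of code-level weights can yield both code-level and visit-level explanations.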
Leveraging a large cohort of 222,436 deliveries comprising 27,100 unique clinical concepts, our model predicted preterm birth with ROC-AUC of 0.82, 0.79, and 0.78 and PR-AUC of 0.40, 0.31, and 0.24 at 1, 3, and 6 months prior to delivery, respectively. The results also confirm that observational data modalities (such as diagnoses) are more predictive of preterm birth than interventional data modalities (e.g., medications and procedures).
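For readers less familiar with the two reported metrics, the sketch below computes both on a toy example. The labels and scores are invented for illustration and are unrelated to the study cohort. The helper names `roc_auc` and `average_precision` are assumptions, and average precision is used here as one common estimator of PR-AUC; the abstract does not state which estimator the authors used.

```python
import numpy as np


def roc_auc(y_true, y_score):
    """ROC-AUC as the probability that a positive outranks a negative (ties count half)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true]
    neg = y_score[~y_true]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))


def average_precision(y_true, y_score):
    """Average precision: mean of precision@k taken at the rank of each positive."""
    y_true = np.asarray(y_true, dtype=int)
    order = np.argsort(-np.asarray(y_score, dtype=float), kind="stable")
    hits = y_true[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision_at_k * hits).sum() / hits.sum())


# Toy data (made up): 3 positives among 10 cases.
y_true = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
y_score = [0.1, 0.4, 0.35, 0.2, 0.8, 0.05, 0.3, 0.7, 0.6, 0.15]

print("ROC-AUC:", round(roc_auc(y_true, y_score), 3))
print("PR-AUC (average precision):", round(average_precision(y_true, y_score), 3))
```

A classifier that ranks cases at random has an expected ROC-AUC of 0.5 regardless of class balance, whereas its expected average precision is approximately the prevalence of the positive class. PR-AUC values therefore need to be read against the preterm birth rate in the cohort rather than against 0.5.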
Our results demonstrate that PredictPTB can provide accurate and scalable predictions of preterm birth, complemented by explanations that directly highlight evidence in the patient's EHR timeline.