State Key Laboratory of Digital Medical Engineering, School of Instrument Science and Engineering, Southeast University, Nanjing, China.
Emergency Intensive Care Unit (EICU), The Affiliated Hospital of Yangzhou University, Yangzhou University, Yangzhou, China.
J Med Internet Res. 2024 Sep 4;26:e54621. doi: 10.2196/54621.
Sepsis is a heterogeneous syndrome, and enrollment of more homogeneous patients is essential to improve the efficiency of clinical trials. Artificial intelligence (AI) has facilitated the identification of homogeneous subgroups, but it remains unclear how to estimate the uncertainty of model outputs when AI is applied to clinical decision-making.
We aimed to design an AI-based model for purposeful patient enrollment, ensuring that a patient with sepsis recruited into a trial would still be persistently ill by the time the proposed therapy could impact patient outcome. We also expected that the model could provide interpretable factors and estimate the uncertainty of the model outputs at a customized confidence level.
In this retrospective study, 9135 patients with sepsis requiring vasopressor treatment within 24 hours after sepsis onset were enrolled from Beth Israel Deaconess Medical Center. This cohort was used for model development, and 10-fold cross-validation with 50 repeats was used for internal validation. In total, 3743 patients with sepsis from the eICU Collaborative Research Database were used as the external validation cohort. All included patients with sepsis were stratified based on disease progression trajectories: rapid death, recovery, and persistent ill. A total of 148 variables were selected for predicting the 3 trajectories. Four machine learning algorithms with 3 different setups were used. We estimated the uncertainty of the model outputs using conformal prediction (CP). The Shapley Additive Explanations method was used to explain the model.
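To illustrate how conformal prediction attaches a user-chosen confidence level to a 3-class output, here is a minimal split-conformal sketch. It is not the authors' implementation: the nonconformity score (1 minus the probability of the true class), the function names, and the toy calibration data are all assumptions made for illustration, and the abstract does not specify which CP variant was used.

```python
import math

LABELS = ("rapid death", "recovery", "persistent ill")

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal threshold from a held-out calibration set.

    Assumed nonconformity score: 1 - probability of the true class.
    """
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return 1.0  # too few calibration points: every label is kept
    return scores[k - 1]

def prediction_set(probs, threshold, labels=LABELS):
    """Keep every label whose nonconformity score is within the threshold."""
    return [lab for lab, p in zip(labels, probs) if 1.0 - p <= threshold]

# Toy calibration set (hypothetical probabilities, true label as class index).
cal_probs = [
    (0.9, 0.05, 0.05), (0.1, 0.8, 0.1), (0.2, 0.1, 0.7), (0.6, 0.3, 0.1),
    (0.3, 0.5, 0.2), (0.3, 0.3, 0.4), (0.3, 0.4, 0.3),
]
cal_labels = [0, 1, 2, 0, 1, 2, 0]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
confident = prediction_set((0.85, 0.10, 0.05), threshold)
ambiguous = prediction_set((0.05, 0.45, 0.50), threshold)
```

A prediction set with a single label can be acted on at the chosen confidence level, whereas a set with several labels flags a patient whose trajectory the model cannot separate at that level.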
The multiclass gradient boosting machine was identified as the best-performing model, with good discrimination and calibration performance in both validation cohorts. The mean area under the receiver operating characteristic curve (SD) was 0.906 (0.018) for rapid death, 0.843 (0.008) for recovery, and 0.807 (0.010) for persistent ill in the internal validation cohort. In the external validation cohort, the mean area under the receiver operating characteristic curve (SD) was 0.878 (0.003) for rapid death, 0.764 (0.008) for recovery, and 0.696 (0.007) for persistent ill. The maximum norepinephrine equivalence, total urine output, Acute Physiology Score III, mean systolic blood pressure, and the coefficient of variation of oxygen saturation contributed the most to the model's predictions. Compared with the model without CP, the model with CP under a mixed-confidence approach reduced overall prediction errors by 27.6% (n=62) and 30.7% (n=412) in the internal and external validation cohorts, respectively, and enabled the identification of more potentially persistent ill patients.
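Per-trajectory discrimination of the kind reported above can be computed by treating each class in turn as the positive class. The sketch below assumes a one-vs-rest scheme, which the abstract does not confirm, and uses the Mann-Whitney formulation of the area under the receiver operating characteristic curve; the function name and toy scores are hypothetical.

```python
def auroc_one_vs_rest(scores, labels, positive):
    """AUROC for one class against the rest (Mann-Whitney U formulation).

    scores: predicted probability of the `positive` class for each patient.
    labels: observed trajectory for each patient.
    """
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy data: predicted probability of "rapid death" for six patients.
labels = ["rapid death", "recovery", "persistent ill",
          "recovery", "rapid death", "persistent ill"]
rapid_death_scores = [0.9, 0.2, 0.4, 0.1, 0.3, 0.35]

auc = auroc_one_vs_rest(rapid_death_scores, labels, "rapid death")
```

Repeating the call with the probability column and label of each trajectory gives one value per class; averaging those values over repeated cross-validation folds would yield a mean and SD in the form reported here.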
The implementation of our model has the potential to reduce heterogeneity and enroll more homogeneous patients in sepsis clinical trials. The use of CP for estimating the uncertainty of the model outputs allows for a more comprehensive understanding of the model's reliability and assists in making informed decisions based on the predicted outcomes.