Li Qiang, Li Dongchen, Jiao He, Wu Zhenhua, Nie Weizhi
School of Microelectronics, Tianjin University, Tianjin, China.
School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, China.
Front Cell Infect Microbiol. 2024 Nov 29;14:1488130. doi: 10.3389/fcimb.2024.1488130. eCollection 2024.
Early prediction of sepsis based on machine learning or deep learning has achieved good results. Most methods use structured data stored in electronic medical records, but the pathological characteristics of sepsis involve complex interactions between multiple physiological systems and signaling pathways, resulting in structured data in which these effects are mixed. Some researchers introduce unstructured data, which also introduces confounders. These confounders mask the direct causality of sepsis, leading the model to learn misleading correlations. Ultimately, this affects the generalization ability, robustness, and interpretability of the model.
To address this challenge, we propose an early sepsis prediction approach based on causal inference that removes confounding effects and captures causal relationships. First, we analyze the relationships among each type of observation, the confounders, and the label to construct a causal structure diagram. To eliminate the effects of the different confounders separately, we use back-door adjustment and an instrumental variable. Specifically, we learn the confounder and an instrumental variable from the various observed data based on mutual information, and eliminate the influence of the confounder by optimizing mutual information. We use back-door adjustment to eliminate the influence of confounders in clinical notes and static indicators on the true causal effect.
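As a rough illustration of the back-door adjustment mentioned above, the following minimal sketch estimates P(Y=1 | do(X=x)) = Σ_z P(Y=1 | X=x, Z=z) · P(Z=z) on a toy discrete dataset. It is not the CISepsis implementation: the binary variables Z (a stand-in for a confounder such as a static indicator), X (an observed feature), and Y (the sepsis label), the function names, and all counts are hypothetical and chosen only to show how the adjusted estimate can differ from the naive conditional probability.

```python
from collections import Counter

# Hypothetical toy counts for (z, x, y); these numbers are invented for illustration.
TOY_COUNTS = {
    (0, 1, 1): 4,  (0, 1, 0): 16, (0, 0, 1): 8,  (0, 0, 0): 72,
    (1, 1, 1): 48, (1, 1, 0): 32, (1, 0, 1): 10, (1, 0, 0): 10,
}

# Expand the counts into one (z, x, y) record per simulated patient.
records = [key for key, n in TOY_COUNTS.items() for _ in range(n)]


def naive_conditional(records, x_value):
    """Estimate P(Y=1 | X=x_value) directly, ignoring the confounder Z."""
    ys = [y for _, x, y in records if x == x_value]
    return sum(ys) / len(ys)


def backdoor_adjusted(records, x_value):
    """Estimate P(Y=1 | do(X=x_value)) by stratifying on Z and reweighting by P(Z)."""
    n = len(records)
    z_counts = Counter(z for z, _, _ in records)
    total = 0.0
    for z, n_z in z_counts.items():
        stratum = [y for zz, x, y in records if zz == z and x == x_value]
        if not stratum:
            raise ValueError(f"no records with Z={z} and X={x_value}")
        # P(Y=1 | X=x, Z=z) weighted by the marginal P(Z=z)
        total += (sum(stratum) / len(stratum)) * (n_z / n)
    return total


naive_diff = naive_conditional(records, 1) - naive_conditional(records, 0)
adjusted_diff = backdoor_adjusted(records, 1) - backdoor_adjusted(records, 0)
print(f"naive difference:    {naive_diff:.2f}")
print(f"adjusted difference: {adjusted_diff:.2f}")
```

In this toy setting the naive difference overstates the effect of X because Z influences both X and Y; stratifying on Z and averaging over its marginal distribution removes that spurious part of the association.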
Our method, named CISepsis, was validated on the MIMIC-IV dataset. Compared to existing state-of-the-art early sepsis prediction models such as XGBoost, LSTM, and MGP-AttTCN, our method demonstrated a significant improvement in AUC. Specifically, our model achieved AUC values of 0.921, 0.920, 0.919, 0.923, 0.924, 0.926, and 0.926 at the 6, 5, 4, 3, 2, 1, and 0 time points, respectively. Furthermore, the effectiveness of our method was confirmed through ablation experiments.
Our method, based on causal inference, effectively removes the influence of confounding factors, significantly improving the predictive accuracy of the model. Compared to traditional methods, this adjustment allows for a more accurate capture of the true causal effects of sepsis, thereby enhancing the model's generalizability, robustness, and interpretability. Future research will explore the impact of specific indicators or treatment interventions on sepsis using counterfactual adjustments in causal inference, as well as investigate the potential clinical application of our method.