College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China; Centre for Distributed and High Performance Computing, School of Computer Science, The University of Sydney, Darlington, NSW, 2008, Australia.
Centre for Distributed and High Performance Computing, School of Computer Science, The University of Sydney, Darlington, NSW, 2008, Australia.
Comput Biol Med. 2023 Oct;165:107418. doi: 10.1016/j.compbiomed.2023.107418. Epub 2023 Sep 3.
Early detection of sepsis is crucial for improving patient outcomes, because sepsis is a significant public health concern that causes substantial morbidity and mortality. Although the Sequential Organ Failure Assessment (SOFA) is widely used in clinical settings to identify sepsis, obtaining sufficient physiological data before onset remains challenging, which limits early detection. To address this challenge, we propose an interpretable machine learning model, ITFG (Interpretable Tree-based Feature Generation), that leverages potential correlations between features, based on existing knowledge, to identify sepsis within six hours of onset using valuable and continuous physiological measures. Furthermore, we introduce a Semi-supervised Attention-based Conditional Transfer Learning (SAC-TL) framework that enhances the model's generality and enables early warning of sepsis in the target domain with less information from the source domain. Our proposed approaches effectively address the problem of systematic feature sparsity and missing data, while remaining practical in settings that require different degrees of generalizability. We evaluated our proposed approaches on two open datasets, MIMIC and PhysioNet, obtaining AUCs of 97.98% and 86.21%, respectively, which demonstrates their effectiveness in different data environments and achieves the best early detection results.
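For readers unfamiliar with the AUC metric reported above, the following is a minimal illustrative sketch of how an AUC value can be computed from binary labels and model scores. It is not the authors' evaluation code: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only to show that AUC is the probability that a randomly chosen positive (septic) case receives a higher score than a randomly chosen negative case.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    Tied positive/negative scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from MIMIC or PhysioNet.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

The quadratic pairwise loop is chosen for readability; on datasets of realistic size a rank-based computation or an established library routine would be the usual choice.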