Liu Shuhui, Fu Bo, Wang Wen, Liu Mei, Sun Xin
IEEE J Biomed Health Inform. 2022 Aug;26(8):4258-4269. doi: 10.1109/JBHI.2022.3171673. Epub 2022 Aug 11.
Sepsis is a systemic inflammatory response caused by pathogens such as bacteria. Because its pathogenesis is unclear, clinical manifestations vary greatly between patients, and its high incidence and mortality pose a serious threat to patients and medical systems, especially in the intensive care unit (ICU). Traditional diagnostic criteria suffer from low specificity. Artificial intelligence models could greatly improve the accuracy of sepsis prediction and diagnosis. This paper proposes a novel model, built on the XGBoost machine learning framework, that dynamically predicts sepsis and assesses risk from demographic, vital-sign, laboratory-test, and medical-intervention data. Two feature-construction methods are introduced to realize the model. For the observed time-series data of vital signs and laboratory tests, a time-dependent method constructs time-dependent features after statistical screening. For the clinical intervention data, a statistical counting method constructs count-dependent features. In addition, a new objective function is proposed for the XGBoost framework, and its first-order and second-order gradients are given for model training. Compared with current state-of-the-art methods, the proposed model achieves the best performance, improving AUROC by 5.4% on the MIMIC-III dataset and by 2.1% on the PhysioNet Challenge 2019 dataset. The data processing and training methods of this model can be readily applied to different electronic health record systems and have broad application prospects.
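The abstract does not spell out the feature definitions or the new objective function, so the following is only a minimal sketch of the general pattern it describes, under stated assumptions. The function names, the trailing-window statistics, the cumulative intervention count, and the class-weighted logistic loss with `pos_weight` are all illustrative stand-ins, not the authors' formulation. What carries over is the shape of the problem: per-time-step features derived from observations and interventions, and a custom objective defined by its first-order and second-order gradients with respect to the raw model score.

```python
import math
from typing import Dict, List, Sequence, Tuple


def time_dependent_features(values: Sequence[float], window: int = 3) -> List[Dict[str, float]]:
    """Summarize one observed signal (e.g. heart rate) over a trailing window.

    Hypothetical feature set: the paper's actual time-dependent features
    are not given in the abstract.
    """
    rows = []
    for t in range(len(values)):
        recent = values[max(0, t - window + 1): t + 1]
        rows.append({
            "last": values[t],
            "mean": sum(recent) / len(recent),
            "min": min(recent),
            "max": max(recent),
            # Change since the previous time step; 0.0 at the first step.
            "delta": values[t] - values[t - 1] if t > 0 else 0.0,
        })
    return rows


def count_dependent_features(events: Sequence[int]) -> List[int]:
    """Cumulative number of interventions up to each time step.

    `events[t]` is 1 if the intervention occurred at step t, else 0
    (an assumed encoding).
    """
    total = 0
    counts = []
    for event in events:
        total += event
        counts.append(total)
    return counts


def _sigmoid(margin: float) -> float:
    """Numerically stable logistic function."""
    if margin >= 0:
        return 1.0 / (1.0 + math.exp(-margin))
    exp_m = math.exp(margin)
    return exp_m / (1.0 + exp_m)


def weighted_logistic_grad_hess(
    margins: Sequence[float],
    labels: Sequence[int],
    pos_weight: float = 5.0,
) -> Tuple[List[float], List[float]]:
    """First- and second-order gradients of a class-weighted logistic loss.

    Stand-in objective (assumption), not the paper's objective:
        loss = -w * [y * log(p) + (1 - y) * log(1 - p)],  p = sigmoid(margin)
    with w = pos_weight for positive labels and 1 otherwise.
    """
    grads, hessians = [], []
    for margin, label in zip(margins, labels):
        p = _sigmoid(margin)
        weight = pos_weight if label == 1 else 1.0
        grads.append(weight * (p - label))        # d loss / d margin
        hessians.append(weight * p * (1.0 - p))   # d^2 loss / d margin^2
    return grads, hessians
```

In the `xgboost` Python package, a custom objective is a callable passed as the `obj` argument of `xgboost.train`; it receives the raw predictions and the training `DMatrix` and returns the gradient and Hessian arrays. A function like the one above would need a thin wrapper that reads the labels from the `DMatrix` and returns NumPy arrays.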