Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
Int J Lab Hematol. 2021 Dec;43(6):1341-1356. doi: 10.1111/ijlh.13549. Epub 2021 May 4.
Early diagnosis and antibiotic administration are essential for reducing sepsis morbidity and mortality; however, diagnosis remains difficult due to complex pathogenesis and presentation. We created a machine learning model for bacterial sepsis identification in the neonatal intensive care unit (NICU) using hematological analyzer data.
Hematological analyzer data were gathered from NICU patients up to 48 hours prior to clinical evaluation for bacterial sepsis. Five models, Support Vector Machine, K-Nearest Neighbors, Logistic Regression, Random Forest (RF), and Extreme Gradient Boosting (XGBoost), were trained on 60 hematological and nine clinical variables for 2357 cases (1692 control, 665 septic). Clinical-feature-only models (nine variables) were also trained and compared with models that included hematological variables. Feature importance was used to assess the relative contribution of each parameter to performance.
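The comparison between clinical-only and combined feature sets can be sketched as follows. This is a minimal illustration on synthetic data, not the study's pipeline: the sample size, the injected signal, the 5-fold stratified cross-validation, the scaling step, and all hyperparameters are assumptions, and only two of the five model families are shown, using scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 9 clinical and 60 hematological columns.
rng = np.random.default_rng(0)
n_cases, n_clinical, n_hematological = 600, 9, 60
y = (rng.random(n_cases) < 0.28).astype(int)  # roughly the 665/2357 septic share

X_clinical = rng.normal(size=(n_cases, n_clinical))
X_hematological = rng.normal(size=(n_cases, n_hematological))
# Inject an artificial signal into five hematological columns for illustration.
X_hematological[:, :5] += 0.8 * y[:, None]
X_full = np.hstack([X_clinical, X_hematological])

# Two of the five model families named in the abstract; settings are assumptions.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def mean_auc(model, X, y):
    """Average AUC-ROC across the cross-validation folds."""
    return cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()


results = {
    name: {
        "clinical_only": mean_auc(model, X_clinical, y),
        "clinical_plus_hematological": mean_auc(model, X_full, y),
    }
    for name, model in models.items()
}

for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Because the signal here is placed in the hematological columns by construction, the combined feature set should score higher; the sketch shows only the shape of the comparison and says nothing about real effect sizes.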
The three best-performing models were RF, Logistic Regression, and XGBoost. RF achieved an average accuracy of 0.74, AUC-ROC of 0.73, sensitivity of 0.38, and specificity of 0.88. Logistic Regression achieved an average accuracy of 0.70, AUC-ROC of 0.74, sensitivity of 0.62, and specificity of 0.73. XGBoost achieved an average accuracy of 0.72, AUC-ROC of 0.71, sensitivity of 0.40, and specificity of 0.85. All models that included hematological variables performed significantly better than models trained on clinical features alone. Neutrophil parameters had the highest average feature importance.
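The reported accuracies can be cross-checked against the reported sensitivities and specificities, since accuracy is the class-weighted average of the two. The sketch below assumes the evaluation data had the same class mix as the full cohort (665 septic, 1692 control). The published figures are averages, so the identity is only expected to hold approximately.

```python
# Class counts from the abstract.
N_SEPTIC, N_CONTROL = 665, 1692

# Reported average metrics for the three best-performing models.
reported = {
    "Random Forest": {"sensitivity": 0.38, "specificity": 0.88, "accuracy": 0.74},
    "Logistic Regression": {"sensitivity": 0.62, "specificity": 0.73, "accuracy": 0.70},
    "XGBoost": {"sensitivity": 0.40, "specificity": 0.85, "accuracy": 0.72},
}


def implied_accuracy(sensitivity, specificity, n_positive, n_negative):
    """Accuracy as the class-weighted average of sensitivity and specificity."""
    true_positives = sensitivity * n_positive
    true_negatives = specificity * n_negative
    return (true_positives + true_negatives) / (n_positive + n_negative)


implied = {
    name: implied_accuracy(m["sensitivity"], m["specificity"], N_SEPTIC, N_CONTROL)
    for name, m in reported.items()
}

for name, value in implied.items():
    print(f"{name}: implied {value:.3f}, reported {reported[name]['accuracy']:.2f}")
```

Under that assumption the implied values round to the reported accuracies for all three models, which also shows that the RF and XGBoost accuracy is driven largely by specificity on the larger control class.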
Machine learning models using hematological analyzer data can classify NICU patients as sepsis positive or negative and perform better than clinical-feature-only models. Hematological analyzer variables could augment current machine learning algorithms for sepsis classification.