Department of Medicine, University of Wisconsin-Madison, Madison, Wisconsin, USA.
Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, USA.
J Am Med Inform Assoc. 2022 Sep 12;29(10):1696-1704. doi: 10.1093/jamia/ocac109.
Early identification of infection improves outcomes, but developing models for early identification requires determining infection status with manual chart review, limiting sample size. Therefore, we aimed to compare semi-supervised and transfer learning algorithms with algorithms based solely on manual chart review for identifying infection in hospitalized patients.
This multicenter retrospective study of admissions to 6 hospitals used "gold-standard" infection labels from manual chart review and "silver-standard" labels for non-chart-reviewed patients, the latter derived from the Sepsis-3 infection criteria based on antibiotic and culture orders. "Gold-standard" labeled admissions were randomly allocated to training (70%) and testing (30%) datasets. Using patient characteristics, vital signs, and laboratory data from the first 24 hours of admission, we derived deep learning and non-deep learning models using transfer learning and semi-supervised methods. Performance was compared in the gold-standard test set using discrimination and calibration metrics.
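The 70%/30% split and the self-learning (pseudo-labeling) idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, a nearest-centroid classifier stands in for the gradient boosted machine and deep learning models used in the study, and the function names, the 0.9 confidence threshold, and the number of rounds are hypothetical. It is not the authors' pipeline.

```python
import math
import random

def fit_centroids(X, y):
    """Nearest-centroid model: the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_proba(centroids, x):
    """P(class 1) from a softmax over negative squared distances to the centroids."""
    d0 = sum((a - b) ** 2 for a, b in zip(x, centroids[0]))
    d1 = sum((a - b) ** 2 for a, b in zip(x, centroids[1]))
    z = max(-50.0, min(50.0, d1 - d0))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(z))

def self_train(X_gold, y_gold, X_unlabeled, threshold=0.9, max_rounds=5):
    """Fit on gold labels, then repeatedly add confident pseudo-labels and refit."""
    X_train, y_train = list(X_gold), list(y_gold)
    pool = list(X_unlabeled)
    model = fit_centroids(X_train, y_train)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for x in pool:
            p = predict_proba(model, x)
            if p >= threshold:
                confident.append((x, 1))
            elif p <= 1.0 - threshold:
                confident.append((x, 0))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to learn from
        X_train += [x for x, _ in confident]
        y_train += [lab for _, lab in confident]
        pool = remaining
        model = fit_centroids(X_train, y_train)
    return model, len(X_train) - len(X_gold)

# Synthetic stand-in data (assumption): two Gaussian clusters in two features.
rng = random.Random(0)

def sample(label, n):
    center = 2.0 if label == 1 else 0.0
    return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)] for _ in range(n)]

gold_X = sample(0, 50) + sample(1, 50)   # "chart-reviewed" admissions
gold_y = [0] * 50 + [1] * 50
unlabeled_X = sample(0, 500) + sample(1, 500)  # admissions without chart review

# Random 70%/30% allocation of gold-standard admissions.
idx = list(range(len(gold_X)))
rng.shuffle(idx)
cut = int(0.7 * len(idx))
train_idx, test_idx = idx[:cut], idx[cut:]
X_tr = [gold_X[i] for i in train_idx]
y_tr = [gold_y[i] for i in train_idx]
X_te = [gold_X[i] for i in test_idx]
y_te = [gold_y[i] for i in test_idx]

model, n_pseudo = self_train(X_tr, y_tr, unlabeled_X)
correct = sum((predict_proba(model, x) >= 0.5) == (y == 1) for x, y in zip(X_te, y_te))
accuracy = correct / len(X_te)
print(f"pseudo-labeled admissions added: {n_pseudo}, test accuracy: {accuracy:.2f}")
```

Evaluation uses only the held-out gold-standard portion, mirroring the study design. The sketch ignores the silver-standard labels entirely; a transfer learning variant would instead pretrain on those labels and then fine-tune on the gold-standard training set.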
The study comprised 432 965 admissions, of which 2724 underwent chart review. In the test set, deep learning and non-deep learning approaches had similar discrimination (area under the receiver operating characteristic curve of 0.82). Semi-supervised and transfer learning approaches did not improve discrimination over models fit using only silver- or gold-standard data. Transfer learning had the best calibration (unreliability index P value: .997, Brier score: 0.173), followed by self-learning gradient boosted machine (P value: .67, Brier score: 0.170).
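For readers less familiar with the two families of metrics reported here, the following sketch computes one discrimination metric (area under the receiver operating characteristic curve) and one calibration-related metric (Brier score) from scratch. The labels and predicted probabilities are made-up toy values, not study data, and the unreliability index test is omitted.

```python
def auroc(y_true, y_score):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for p, y in zip(y_prob, y_true)) / len(y_true)

# Toy example (assumed values): 1 = infection, 0 = no infection.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

print(auroc(y_true, y_prob))        # 0.75
print(brier_score(y_true, y_prob))  # approximately 0.158
```

A higher AUROC indicates better discrimination, while a lower Brier score indicates predicted probabilities closer to the observed outcomes. The pairwise AUROC loop is quadratic in sample size and is meant only to make the definition explicit.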
Deep learning and non-deep learning models performed similarly for identifying infection, as did models developed using Sepsis-3 and manual chart review labels.
In a multicenter study of almost 3000 chart-reviewed patients, semi-supervised and transfer learning models showed discrimination similar to that of a baseline XGBoost model, while transfer learning improved calibration.