Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, USA.
Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, USA.
J Am Med Inform Assoc. 2021 Dec 28;29(1):72-79. doi: 10.1093/jamia/ocab229.
Hospital-acquired infections (HAIs) are associated with significant morbidity, mortality, and prolonged hospital length of stay. Risk prediction models based on pre- and intraoperative data have been proposed to assess the risk of HAIs at the end of surgery, but the performance of these models lags behind that of HAI detection models based on postoperative data. Postoperative data are more predictive than pre- or intraoperative data because they are closer in time to the outcomes, but they are unavailable when the risk models are applied (at the end of surgery). The objective is to study whether such data, which are temporally unavailable at prediction time (TUP) and thus cannot directly enter the model, can be used to improve the performance of the risk model.
An extensive array of 12 methods based on logistic/linear regression and deep learning was used to incorporate the TUP data through a variety of intermediate representations of the data. Because the different HAI outcomes have a hierarchical structure, a comparison of single-task and multi-task learning frameworks is also presented.
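As a rough illustration of the general idea, and not the authors' implementation, the sketch below shows one way TUP data could inform a risk model through an intermediate representation. Everything in it is an assumption made for illustration: the data are synthetic, the helper names (`expand`, `fit_linear`, `fit_logistic`, `risk_at_end_of_surgery`) are hypothetical, and the two-stage design (regress the postoperative features on the available features, then feed the predicted values into a logistic risk model) is only one simple possibility among the kinds of methods the abstract mentions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions): X_avail plays the role of pre-/intraoperative
# features, X_tup the postoperative (TUP) features, y a binary HAI outcome.
n = 1000
X_avail = rng.normal(size=(n, 4))
X_tup = np.column_stack([
    X_avail[:, 0] ** 2 + 0.3 * rng.normal(size=n),
    X_avail[:, 1] + 0.3 * rng.normal(size=n),
])
y = (X_tup[:, 0] + X_tup[:, 1] + 0.3 * rng.normal(size=n) > 1.0).astype(float)

def add_bias(X):
    return np.hstack([X, np.ones((X.shape[0], 1))])

def expand(X):
    # Nonlinear expansion so the predicted TUP features are not just a
    # linear recombination of the available features.
    return np.hstack([X, X ** 2])

def fit_linear(X, Y):
    # Ordinary least squares from X (plus bias) to Y.
    W, *_ = np.linalg.lstsq(add_bias(X), Y, rcond=None)
    return W

def fit_logistic(X, y, lr=0.1, steps=2000):
    # Plain gradient descent on the mean logistic loss.
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

# Training: TUP data are available here, but only as a regression target.
X_tr, T_tr, y_tr = X_avail[:800], X_tup[:800], y[:800]
W = fit_linear(expand(X_tr), T_tr)                 # stage 1: available -> TUP
Z_tr = add_bias(expand(X_tr)) @ W                  # predicted TUP representation
w = fit_logistic(np.hstack([X_tr, Z_tr]), y_tr)    # stage 2: risk model

def risk_at_end_of_surgery(X):
    # Prediction uses only features available at the end of surgery.
    Z = add_bias(expand(X)) @ W
    logits = add_bias(np.hstack([X, Z])) @ w
    return 1.0 / (1.0 + np.exp(-np.clip(logits, -30, 30)))

proba = risk_at_end_of_surgery(X_avail[800:])
```

In this sketch the TUP features never enter the model at prediction time; they shape it only through the first-stage regression fitted during training. With a purely linear first stage the predicted representation would add nothing beyond the original features in a logistic model, which is why a nonlinear expansion is used here.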
The use of TUP data was always advantageous, as baseline methods, which cannot utilize TUP data, never achieved the top performance. The relative performance of the different models varied across outcomes. Regarding the intermediate representation, we found that its complexity was key and that incorporating label information was helpful.
Using TUP data significantly helped predictive performance irrespective of the model complexity.