Department of Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark.
Department of Hematology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark.
Nat Commun. 2020 Jan 17;11(1):363. doi: 10.1038/s41467-019-14225-8.
Infections have become the major cause of morbidity and mortality among patients with chronic lymphocytic leukemia (CLL) due to immune dysfunction and cytotoxic CLL treatment. Yet, predictive models for infection are missing. In this work, we develop the CLL Treatment-Infection Model (CLL-TIM) that identifies patients at risk of infection or CLL treatment within 2 years of diagnosis as validated on both internal and external cohorts. CLL-TIM is an ensemble algorithm composed of 28 machine learning algorithms based on data from 4,149 patients with CLL. The model is capable of dealing with heterogeneous data, including the high rates of missing data to be expected in the real-world setting, with a precision of 72% and a recall of 75%. To address concerns regarding the use of complex machine learning algorithms in the clinic, for each patient with CLL, CLL-TIM provides explainable predictions through uncertainty estimates and personalized risk factors.
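To make the reported figures concrete, the following is a minimal sketch of how precision and recall are computed for a binary risk prediction, and of one simple way an ensemble can combine its members' outputs. It is not the CLL-TIM implementation: the function names, the 0.5 threshold, the plain averaging rule, the use of standard deviation as an uncertainty proxy, and the toy labels are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def ensemble_predict(member_probs, threshold=0.5):
    """Combine base-model probabilities by plain averaging (an assumed rule).

    Returns (high_risk, average_probability, spread). The spread across
    members is used here only as a crude stand-in for an uncertainty estimate.
    """
    avg = mean(member_probs)
    spread = pstdev(member_probs)
    return avg >= threshold, avg, spread


def precision_recall(y_true, y_pred):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (hypothetical, not study data): 1 = infection or treatment
# within 2 years, 0 = neither.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)

# Three hypothetical base-model probabilities for a single patient.
high_risk, avg, spread = ensemble_predict([0.9, 0.7, 0.8])
```

Read this way, a precision of 72% means that roughly 72 of every 100 patients flagged as high risk go on to have the event, and a recall of 75% means that roughly 75 of every 100 patients who have the event were flagged.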