Division of Cardiology, Department of Medicine, UC San Diego, La Jolla, CA, USA.
University of Groningen, University Medical Center Groningen, Groningen, The Netherlands.
Eur J Heart Fail. 2020 Jan;22(1):139-147. doi: 10.1002/ejhf.1628. Epub 2019 Nov 12.
Predicting mortality is important in patients with heart failure (HF). However, current strategies for predicting risk are only modestly successful, likely because they are derived from statistical analysis methods that fail to capture prognostic information in large data sets containing multi-dimensional interactions.
We used a machine learning algorithm to capture correlations between patient characteristics and mortality. A model was built by training a boosted decision tree algorithm to relate a subset of the patient data with a very high or very low mortality risk in a cohort of 5822 hospitalized and ambulatory patients with HF. From this model we derived a risk score, based on eight variables (diastolic blood pressure, creatinine, blood urea nitrogen, haemoglobin, white blood cell count, platelets, albumin, and red blood cell distribution width), that accurately discriminated between low and high risk of death. This risk score had an area under the curve (AUC) of 0.88 and was predictive across the full spectrum of risk. External validation in two separate HF populations gave AUCs of 0.84 and 0.81, which were superior to those obtained with two available risk scores in these same populations.
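As an illustration of what the reported AUC values measure, the following minimal sketch computes the AUC as the probability that a randomly chosen patient who died received a higher risk score than a randomly chosen survivor, with ties counted as one half. The function name `roc_auc` and the toy scores and outcomes are hypothetical and are not taken from the study; this is not the authors' implementation.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], died: Sequence[int]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    scores: risk score per patient (higher = higher predicted risk).
    died:   1 if the patient died, 0 if the patient survived.
    """
    pos = [s for s, d in zip(scores, died) if d == 1]  # scores of patients who died
    neg = [s for s, d in zip(scores, died) if d == 0]  # scores of survivors
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # correctly ordered pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Made-up toy data for illustration only (not study data).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
toy_died = [1, 1, 0, 1, 0, 0]
toy_auc = roc_auc(toy_scores, toy_died)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to ordering patients no better than chance, and 1.0 to ranking every patient who died above every survivor. The pairwise loop is quadratic in cohort size, so a rank-based formulation would be preferable for a cohort of several thousand patients.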
Using machine learning and readily available variables, we generated and validated a mortality risk score in patients with HF that was more accurate than other risk scores to which it was compared. These results support the use of this machine learning approach for the evaluation of patients with HF and in other settings where predicting risk has been challenging.