Rheumatology Department, Hospital Clínico San Carlos, and IdISSC, Madrid, Spain.
International Centre for Numerical Methods in Engineering (CIMNE), Madrid, Spain.
Sci Rep. 2017 Aug 31;7(1):10189. doi: 10.1038/s41598-017-10558-w.
We developed and independently validated a rheumatoid arthritis (RA) mortality prediction model using the machine learning method Random Survival Forests (RSF). Two independent cohorts from Madrid (Spain) were used: the Hospital Clínico San Carlos RA Cohort (HCSC-RAC; training; 1,461 patients), and the Hospital Universitario de La Princesa Early Arthritis Register Longitudinal study (PEARL; validation; 280 patients). Demographic and clinical variables collected during the first two years after disease diagnosis were used as predictors. In HCSC-RAC and PEARL, 148 and 21 patients died during median follow-up times of 4.3 and 5.0 years, respectively. Age at diagnosis, median erythrocyte sedimentation rate, and number of hospital admissions showed the highest predictive capacity. Prediction errors in the training and validation cohorts were 0.187 and 0.233, respectively. A survival tree identified five mortality risk groups using the predicted ensemble mortality. After 1 and 7 years of follow-up, time-dependent specificity and sensitivity in the validation cohort were 0.79-0.80 and 0.43-0.48, respectively, using the cut-off value dividing the two lower risk categories. Calibration curves showed overestimation of the mortality risk in the validation cohort. In conclusion, we were able to develop a clinical prediction model for RA mortality using RSF, providing evidence for further work on external validation.
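The abstract does not say how the prediction errors of 0.187 and 0.233 were computed. RSF implementations commonly report prediction error as one minus Harrell's concordance index, so the sketch below assumes that definition. The function name `concordance_error` and the toy data are hypothetical and are not taken from the study. Pairs with tied observed times are skipped, which is a simplification.

```python
from itertools import combinations


def concordance_error(times, events, risk):
    """Return 1 - Harrell's C-index for right-censored survival data.

    times  : observed follow-up time per patient
    events : 1 if the patient died, 0 if censored
    risk   : predicted risk score (higher means higher predicted mortality)
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored: pair is not comparable
        usable += 1
        if risk[a] > risk[b]:
            concordant += 1.0  # earlier death had the higher predicted risk
        elif risk[a] == risk[b]:
            concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no comparable pairs")
    return 1.0 - concordant / usable


# Toy data (invented for illustration only).
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
perfect_risk = [0.9, 0.7, 0.4, 0.1]
uninformative_risk = [0.5, 0.5, 0.5, 0.5]

print(concordance_error(toy_times, toy_events, perfect_risk))
print(concordance_error(toy_times, toy_events, uninformative_risk))
```

Under this definition, an error of 0 means every comparable pair is ranked correctly and 0.5 corresponds to uninformative predictions, so the reported values of 0.187 and 0.233 would sit between those two reference points.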