Ferguson Thomas, Ravani Pietro, Sood Manish M, Clarke Alix, Komenda Paul, Rigatto Claudio, Tangri Navdeep
Department of Internal Medicine, Max Rady College of Medicine, University of Manitoba, Winnipeg, Manitoba, Canada.
Seven Oaks Hospital Chronic Disease Innovation Centre, Winnipeg, Manitoba, Canada.
Kidney Int Rep. 2022 May 13;7(8):1772-1781. doi: 10.1016/j.ekir.2022.05.004. eCollection 2022 Aug.
Prediction of disease progression at all stages of chronic kidney disease (CKD) may help improve patient outcomes. As such, we aimed to develop and externally validate a random forest model to predict progression of CKD using demographics and laboratory data.
The model was developed in a population-based cohort from Manitoba, Canada, between April 1, 2006, and December 31, 2016, with external validation in Alberta, Canada. A total of 77,196 individuals with an estimated glomerular filtration rate (eGFR) > 10 ml/min per 1.73 m² and a urine albumin-to-creatinine ratio (ACR) available were included from Manitoba and 107,097 from Alberta. We considered >80 laboratory features, including analytes from complete blood cell counts, chemistry panels, liver enzymes, urine analysis, and quantification of urine albumin and protein. The primary outcome in our study was a 40% decline in eGFR or kidney failure. We assessed model discrimination using the area under the receiver operating characteristic curve (AUC) and calibration using plots of observed and predicted risks.
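As an illustration of the two evaluation measures named above, the following is a minimal sketch and not the authors' code. It assumes binary outcome labels (1 = progression event within the horizon) and predicted risks between 0 and 1; the function names `roc_auc` and `calibration_table` and the choice of equal-size risk bins are assumptions made for this example. It also ignores censoring, which a time-to-event analysis would need to handle.

```python
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC: probability that a random event case is scored above a random non-event case."""
    # Assign 1-based ranks to the scores, giving tied scores their average rank.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one event and one non-event")

    # Mann-Whitney U statistic divided by the number of (event, non-event) pairs.
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def calibration_table(
    y_true: Sequence[int], y_score: Sequence[float], n_bins: int = 10
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted risk, observed event rate, group size) per risk bin."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        # Equal-size bins ordered from lowest to highest predicted risk.
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, observed, len(chunk)))
    return rows
```

Plotting the observed rate against the mean predicted risk for each row gives a calibration plot of the kind described; points close to the diagonal indicate that predicted and observed risks agree.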
The final model achieved an AUC of 0.88 (95% CI 0.87-0.89) at 2 years and 0.84 (0.83-0.85) at 5 years in internal testing. Discrimination and calibration were preserved in the external validation data set with AUC scores of 0.87 (0.86-0.88) at 2 years and 0.84 (0.84-0.86) at 5 years. The top 30% of individuals, those predicted as high risk or intermediate risk, accounted for 87% of CKD progression events within 2 years and 77% of progression events within 5 years.
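The event-capture figures can be read as the share of all progression events that fall among the individuals with the highest predicted risk. The sketch below shows that calculation under this reading; the function name `event_capture`, the rounding rule for the group size, and the toy data are assumptions for illustration and are not taken from the study.

```python
from typing import Sequence


def event_capture(
    y_true: Sequence[int], y_score: Sequence[float], top_fraction: float = 0.30
) -> float:
    """Fraction of all events found among the top `top_fraction` of predicted risks."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total_events = sum(y_true)
    if total_events == 0:
        raise ValueError("no events to capture")

    # Rank individuals from highest to lowest predicted risk.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    # Size of the high-risk group, at least one individual.
    n_top = max(1, round(top_fraction * len(ranked)))
    captured = sum(y for _, y in ranked[:n_top])
    return captured / total_events


# Toy example: 10 individuals, 4 events, 3 of them in the top 30% by risk.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
risks = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
capture = event_capture(labels, risks)  # 3 of 4 events captured
```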
A machine learning model that leverages routinely collected laboratory data can accurately predict eGFR decline or kidney failure.