Park Sehoon, Kim Yisak, Baek Chung Hee, Cho Hyunjeong, Park Ji In, Koh Eun Sil, Lee Jung Pyo, Park Sun-Hee, Kim Hyung Woo, Han Seung Hyeok, Chin Ho Jun, Kim Dong Ki, Moon Kyung Chul, Kim Young-Gon, Lee Hajeong
Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea.
Interdisciplinary Program in Bioengineering, Seoul National University Graduate School, Seoul, Republic of Korea.
Kidney Res Clin Pract. 2024 Sep 12. doi: 10.23876/j.krcp.23.212.
Immunoglobulin A nephropathy (IgAN) is a major cause of end-stage kidney disease (ESKD). The International IgA Nephropathy Prediction Tool (IIgAN-PT) predicts IgAN prognosis, but improvement in the prediction performance using machine learning (ML)-based methods is needed.
We analyzed 4,425 biopsy-confirmed patients with IgAN and ≥6 months of follow-up from nine tertiary university hospitals in Korea. The study population was divided into development and validation cohorts. Using the collected 87 clinicodemographic and pathological variables, ML-based prediction models for ESKD or estimated glomerular filtration rate were constructed: 1) the conventional CatBoost model, 2) the optimized CatBoost model with Cox proportional hazards, 3) the deep Cox proportional hazards model, and 4) the deep Cox mixture model. The area under the curve (AUC) and calibration plots were used to investigate the discriminative and calibration performance of the models, which were then compared with those of the IIgAN-PT full model.
The IIgAN-PT full model showed excellent performance (AUC [95% confidence interval] for the 5-year outcome, 0.896 [0.853-0.940]), with acceptable calibration results. The ML-based models showed good performance in predicting adverse kidney outcomes and acceptable discrimination in the external validation (AUC [95% confidence interval] for the 5-year outcome: 1) 0.829 [0.791-0.866]; 2) 0.847 [0.804-0.890]; 3) 0.823 [0.784-0.862]; and 4) 0.832 [0.794-0.870]), although they underestimated the risks in the external validation cohort. With the validation data, the overall performance of the IIgAN-PT was non-inferior to that of the ML-based models. Conclusions: Our ML-based models showed good performance in predicting adverse kidney outcomes in patients with IgAN, but they did not outperform the IIgAN-PT.
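For readers unfamiliar with the discrimination metric reported above, the following is a minimal sketch of how an AUC and a 95% confidence interval could be computed for a binary 5-year outcome. The abstract does not state how the intervals were derived; the percentile bootstrap used here, the function names `auc` and `bootstrap_auc_ci`, and the simplification of ignoring censoring are all assumptions made for illustration, not the authors' procedure.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a randomly chosen event case receives a
    higher predicted risk than a randomly chosen non-event case (ties = 0.5).
    Labels must be 0 (no event) or 1 (event)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method;
    the abstract does not specify one)."""
    auc(labels, scores)  # fails early if one class is missing
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one outcome class.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Toy data only: 1 = reached the outcome within 5 years, 0 = did not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
point_estimate = auc(toy_labels, toy_scores)
interval = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200)
```

With time-to-event data such as ESKD, a time-dependent AUC that accounts for censoring would normally be preferred; the pairwise version above is only meant to show what the metric measures.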