Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, Guangdong, China.
Department of Nephrology, Jiujiang Hospital of Traditional Chinese Medicine, Jiujiang, Jiangxi, China.
PLoS One. 2022 Mar 9;17(3):e0265017. doi: 10.1371/journal.pone.0265017. eCollection 2022.
Immunoglobulin A nephropathy (IgAN) is the most common primary glomerular disease worldwide. Its clinical manifestations differ, the severity of its pathological changes varies, crescent formation is a common complication that occurs in varying proportions, and clinical outcomes are highly heterogeneous between individuals. We therefore aimed to develop a machine learning (ML)-based model for predicting the prognosis of IgAN with focal crescent formation and without obvious chronic renal lesions (glomerulosclerosis <25%).
We retrospectively reviewed patients with biopsy-proven IgAN at our hospital and a cooperating hospital from 2005 to 2017. Random forest (RF) feature importance was used to explore the candidate variables and identify those closely related to the prognosis of focal crescentic IgAN. Multiple ML algorithms were then used to build prediction models. Predictive performance was evaluated with the area under the precision-recall curve (AUPRC) and the area under the receiver operating characteristic curve (AUROC) using three-fold cross-validation (in each round, two folds served as the training set and one as the validation set).
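The evaluation scheme described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names (`three_fold_splits`, `auroc`, `average_precision`) are hypothetical, the interleaved fold assignment is an assumption (the abstract does not say how folds were formed), and average precision is used here as one common estimate of AUPRC.

```python
from typing import List, Sequence, Tuple


def three_fold_splits(n_samples: int, k: int = 3) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, validation_indices) for each of k folds.

    Assumption: samples are assigned to folds in an interleaved pattern;
    the abstract does not state how its folds were built.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        validation = folds[i]
        # The remaining k - 1 folds together form the training set.
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        splits.append((train, validation))
    return splits


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of the precision values at each positive, ranked by score.

    Assumption: scores are distinct; tied scores are not handled specially.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(1 for _, y in ranked if y == 1)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive label")
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos
```

In a full pipeline, a model would be fitted on each training split and `auroc` and `average_precision` would be computed on the matching validation split, then averaged over the three rounds.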
RF was used to screen for important features; the top three were baseline estimated glomerular filtration rate (eGFR), serum creatinine, and triglycerides. Ten features were selected as predictors for modeling, based on both the data-driven ranking and clinical judgment: age, baseline eGFR, serum creatinine, serum triglycerides, complement 3 (C3), proteinuria, mean arterial pressure (MAP), hematuria, proportion of glomeruli with crescents, and proportion of glomeruli with global crescents. Among the ML algorithms tested, the support vector machine (SVM) showed better predictive performance than the others, with a precision of 0.77, recall of 0.77, F1-score of 0.73, accuracy of 0.77, AUROC of 79.57%, and AUPRC of 76.5%.
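For readers less familiar with the threshold-based metrics reported above, the following sketch shows how precision, recall, F1-score, and accuracy are derived from a binary confusion matrix. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the study; the abstract also does not state whether its figures are per-class or averaged across classes, so this shows only the standard single-class definitions.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute precision, recall, F1-score, and accuracy from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives for the class treated as positive.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one sample is required")
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / total
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Illustrative counts only (not data from the study).
example = classification_metrics(tp=8, fp=2, fn=4, tn=6)
```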
The SVM model is potentially useful for predicting the prognosis of patients with IgAN who have focal crescent formation and no obvious chronic renal lesions.