Department of Nephrology, Huadong Hospital Affiliated To Fudan University, Shanghai, 200040, China.
Shanghai Key Laboratory of Clinical Geriatric Medicine, Huadong Hospital Affiliated To Fudan University, Shanghai, 200040, China.
J Transl Med. 2019 Apr 11;17(1):119. doi: 10.1186/s12967-019-1860-0.
Urinary protein quantification is critical for assessing the severity of chronic kidney disease (CKD). However, the current procedure relies on measuring 24-h urinary protein, which is inconvenient during follow-up.
To quickly predict the severity of CKD using more easily available demographic and blood biochemical features during follow-up, we developed and compared several predictive models using statistical, machine learning and neural network approaches.
Clinical and blood biochemical results were collected from 551 patients with proteinuria. Thirteen blood-derived tests and 5 demographic features were used as non-urinary clinical variables to predict the 24-h urinary protein outcome. Nine predictive models were established and compared: logistic regression, Elastic Net, lasso regression, ridge regression, support vector machine, random forest, XGBoost, neural network and k-nearest neighbor. The area under the receiver operating characteristic curve (AUC), sensitivity (recall), specificity, accuracy, log-loss and precision of each model were evaluated. The effect size of each variable was analysed and ranked.
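The evaluation metrics listed above can be illustrated with a minimal sketch in plain Python. It assumes the 24-h urinary protein outcome is treated as a binary label (the abstract does not state the cutoff) and that each model outputs a predicted probability. The function names, the 0.5 decision threshold and the toy data are hypothetical and are not taken from the study.

```python
import math

def classification_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity, precision and accuracy at a fixed threshold.

    Assumes both classes occur in y_true and that at least one case is
    predicted positive; otherwise a denominator would be zero.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn),   # recall: true positives among actual positives
        "specificity": tn / (tn + fp),   # true negatives among actual negatives
        "precision": tp / (tp + fp),     # true positives among predicted positives
        "accuracy": (tp + tn) / len(y_true),
    }

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels (binary cross-entropy)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)    # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def auc_roc(y_true, y_prob):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy data, invented for illustration only (not patient data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

metrics = classification_metrics(y_true, y_prob)
auc = auc_roc(y_true, y_prob)
ll = log_loss(y_true, y_prob)
print(metrics, auc, ll)
```

Sensitivity, specificity, precision and accuracy depend on the chosen threshold, whereas AUC and log-loss are computed from the predicted probabilities directly, which is one reason to report them alongside each other when comparing models.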
The linear models (Elastic Net, lasso regression, ridge regression and logistic regression) showed the highest overall predictive power, with an average AUC above 0.87 and a precision above 0.8. Logistic regression ranked first, reaching an AUC of 0.873, with a sensitivity of 0.83 and a specificity of 0.82. The model with the highest sensitivity was Elastic Net (0.85), while XGBoost showed the highest specificity (0.83). In the effect size analyses, ALB, Scr, TG, LDL and EGFR had important impacts on the predictability of the models, while other predictors such as CRP, HDL and SNA were less important.
Blood-derived tests could be applied as non-urinary predictors during outpatient follow-up. Features in routine blood tests, including ALB, Scr, TG, LDL and EGFR levels, showed predictive ability for CKD severity. The developed online tool can facilitate the prediction of proteinuria progression during follow-up in clinical practice.