Institute of Medicine, Chung Shan Medical University, Taichung 40201, Taiwan.
General Administrative Department, United Safety Medical Group, New Taipei City 24205, Taiwan.
Int J Environ Res Public Health. 2020 Jul 10;17(14):4973. doi: 10.3390/ijerph17144973.
Developing effective risk prediction models is a cost-effective approach to predicting complications of chronic kidney disease (CKD) and mortality rates; however, there is inadequate evidence to support screening for CKD. In this study, four data mining algorithms (a classification and regression tree, a C4.5 decision tree, linear discriminant analysis, and an extreme learning machine) were used to predict early CKD. The study used data from 19,270 patients, provided by an adult health examination program run by 32 chain clinics and three special physical examination centers between 2015 and 2019. There were 11 independent variables, and the glomerular filtration rate (GFR) was used as the outcome (predicted) variable. The C4.5 decision tree algorithm outperformed the three comparison models in predicting early CKD on accuracy, sensitivity, specificity, and area under the curve (AUC), and is therefore a promising method for early CKD prediction. The experimental results showed that urine protein-to-creatinine ratio (UPCR), proteinuria (PRO), red blood cells (RBC), fasting glucose (GLU), triglycerides (TG), total cholesterol (T-CHO), age, and gender are important risk factors. CKD care is closely related to the level of primary care and is recognized as a healthcare priority in national strategy. The proposed risk prediction models support the importance of personal characteristics and health examination findings in predicting early CKD.
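As an illustration of the four evaluation metrics named above, the following minimal Python sketch computes accuracy, sensitivity, specificity, and AUC for a binary classifier. It is not the study's code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for this example, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from binary labels (1 = early CKD)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive case gets the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = confusion_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print(metrics)
```

Comparing models on all four metrics together, as the study does, guards against a classifier that scores well on accuracy alone simply because one class dominates the data.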