Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA; Sanford Research, Sioux Falls, SD 57104, USA.
Bioinformatics and Mathematical Biosciences Lab, Department of Mathematics and Statistics, South Dakota State University, Brookings, SD 57006, USA.
Math Biosci. 2019 Apr;310:24-30. doi: 10.1016/j.mbs.2019.02.001. Epub 2019 Feb 12.
Chronic kidney disease (CKD) is prevalent worldwide, and kidney function is well characterized by the estimated glomerular filtration rate (eGFR). The progression of kidney disease can be predicted if future eGFR can be estimated accurately using predictive analytics. In this study, we developed and validated a prediction model of eGFR using data extracted from a regional health system. The dataset includes demographic, clinical, and laboratory information from primary care clinics. The model was built using Random Forest regression and evaluated using goodness-of-fit statistics and discrimination metrics. After data preprocessing, the patient cohort for model development and validation contained 61,740 patients. The final model included eGFR, age, gender, body mass index (BMI), obesity, hypertension, and diabetes as predictors, and achieved a mean coefficient of determination of 0.95. Classifying patients into CKD stages from the predicted eGFR values yielded high macro-averaged and micro-averaged metrics. In conclusion, a model using real-world electronic medical records (EMR) data can accurately predict future kidney function and provide clinical decision support.
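The evaluation described in the abstract (a coefficient of determination for the predicted eGFR, followed by CKD staging scored with macro- and micro-averaged metrics) can be illustrated with a minimal sketch. This is not the authors' code. It assumes the KDIGO GFR categories (G1 to G5, with stage 3 split into G3a and G3b) as the staging scheme, which the abstract does not specify, and it uses F1 as the averaged metric. The function names and the sample eGFR values are hypothetical and made up for illustration.

```python
from collections import Counter

# Assumed staging scheme: KDIGO GFR categories, given as
# (lower bound in mL/min/1.73 m^2, label). The study may have used another scheme.
KDIGO_STAGES = [(90, "G1"), (60, "G2"), (45, "G3a"), (30, "G3b"), (15, "G4"), (0, "G5")]


def ckd_stage(egfr):
    """Map an eGFR value to a KDIGO GFR category."""
    for lower, label in KDIGO_STAGES:
        if egfr >= lower:
            return label
    raise ValueError("eGFR must be non-negative")


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def f1_scores(true_labels, pred_labels):
    """Return (macro_f1, micro_f1) for single-label multiclass predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(true_labels, pred_labels):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    classes = sorted(set(true_labels) | set(pred_labels))

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, weighting every class equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Made-up observed and predicted eGFR values, for illustration only.
observed = [95.0, 72.0, 50.0, 38.0, 20.0, 10.0]
predicted = [92.0, 75.0, 44.0, 36.0, 22.0, 12.0]

true_stages = [ckd_stage(v) for v in observed]
pred_stages = [ckd_stage(v) for v in predicted]
macro_f1, micro_f1 = f1_scores(true_stages, pred_stages)
print(r_squared(observed, predicted), macro_f1, micro_f1)
```

In this toy example the regression fit is close, yet one patient near the 45 mL/min/1.73 m^2 boundary is assigned to the wrong stage. Macro-averaging penalizes that error more heavily than micro-averaging because the misclassified stage has only one patient, which is one reason to report both averages when CKD stages are imbalanced.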