Aminnejad Neda, Greiver Michelle, Huang Huaxiong
Department of Mathematics & Statistics, York University, Toronto, Canada.
Department of Family & Community Medicine, University of Toronto, Toronto, Canada.
PLOS Digit Health. 2025 Jan 22;4(1):e0000700. doi: 10.1371/journal.pdig.0000700. eCollection 2025 Jan.
Chronic kidney disease (CKD) affects over 13% of the population, more than 800 million individuals worldwide. Timely identification and intervention are crucial to delay CKD progression and improve patient outcomes. This research develops a predictive model that classifies diabetic patients showing signs of kidney function impairment according to their risk of developing CKD. The model uses electronic medical record (EMR) data, specifically patient demographics, laboratory results, chronic conditions, risk factors, and medication codes, to predict the onset of CKD in diabetic patients six months in advance, achieving an average area under the curve (AUC) of 0.88. Rather than using temporal EMR data, we aggregate EMR data to capture the relevant information within the observation year. We also identify the most significant features for predicting CKD onset: the mean, minimum, and first quartile of the estimated glomerular filtration rate (eGFR) during the observation year; age at diagnosis and duration of hypertension, osteoarthritis, and diabetes; and levels of hemoglobin and fasting blood glucose (FBG). A refined model that uses only these most significant features yields a slightly lower AUC of 0.86. These variables are typically available in primary data, which allows physicians to assess risk in real time. The model's ability to identify higher-risk patients supports timely intervention, personalized care, risk stratification, patient education, and potential cost savings. This research offers healthcare practitioners an efficient tool for early CKD detection in diabetic populations.
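The abstract names two computations that a short example may make concrete: summarizing a patient's eGFR readings over the observation year into a mean, minimum, and first quartile, and scoring a classifier by AUC. The following is a minimal sketch using only the Python standard library, not the authors' implementation. The function names `aggregate_egfr` and `auc`, the dictionary keys, the quartile interpolation method, and the sample values are all assumptions made for illustration.

```python
from statistics import mean, quantiles


def aggregate_egfr(readings):
    """Summarize one patient's eGFR readings from the observation year.

    Returns the three aggregates named in the abstract. The 'inclusive'
    quartile method is an assumption; the paper does not state which
    interpolation rule was used.
    """
    if not readings:
        raise ValueError("at least one eGFR reading is required")
    values = sorted(readings)
    if len(values) == 1:
        q1 = values[0]  # quantiles() needs at least two data points
    else:
        q1 = quantiles(values, n=4, method="inclusive")[0]
    return {"egfr_mean": mean(values), "egfr_min": values[0], "egfr_q1": q1}


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical values, for illustration only.
features = aggregate_egfr([90, 60, 80, 70])
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Aggregating in this way gives each patient a fixed-length feature vector regardless of how many laboratory measurements were recorded, which is one practical reason to prefer summary statistics over raw time series. The pairwise AUC runs in quadratic time and is meant only to show the definition; a library routine would be the usual choice on real data.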