Oh Tae Ryom, Song Su Hyun, Choi Hong Sang, Suh Sang Heon, Kim Chang Seong, Jung Ji Yong, Choi Kyu Hun, Oh Kook-Hwan, Ma Seong Kwon, Bae Eun Hui, Kim Soo Wan
Department of Internal Medicine, Chonnam National University Hospital, Gwangju 61469, Korea.
Department of Internal Medicine, Gachon University of Medicine and Science, Incheon 21565, Korea.
J Pers Med. 2021 Dec 15;11(12):1372. doi: 10.3390/jpm11121372.
Cardiovascular disease is a major complication of chronic kidney disease. The coronary artery calcium (CAC) score is a surrogate marker for the risk of coronary artery disease. The purpose of this study is to predict outcomes for non-dialysis chronic kidney disease patients under the age of 60 with high CAC scores using machine learning techniques. We developed the predictive models with a representative chronic kidney disease cohort, the Korean Cohort Study for Outcomes in Patients with Chronic Kidney Disease (KNOW-CKD). We divided the cohort into a training dataset (70%) and a validation dataset (30%). The test dataset comprised an external dataset of patients who were not included in the KNOW-CKD cohort. Support vector machine, random forest, XGBoost, logistic regression, and multilayer perceptron neural network models were used as the predictive models. We evaluated each model's performance using the area under the receiver operating characteristic curve (AUROC). Shapley additive explanations (SHAP) values were applied to select the important features. The random forest model showed the best predictive performance (AUROC 0.87), and its performance differed significantly from that of the traditional logistic regression model in the test dataset. This study will help identify young patients with chronic kidney disease who are at high risk of cardiovascular complications and help establish individualized treatment strategies.
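The 70%/30% split and the AUROC metric described in the abstract can be illustrated with a minimal sketch in standard-library Python. This is not the authors' code: the function names `split_70_30` and `auroc`, the fixed seed, and the toy labels and scores are assumptions made only for illustration. The AUROC is computed here from its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle rows and split them into 70% training and 30% validation.

    The seed is a hypothetical choice for reproducibility, not a value
    taken from the study.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data only; these are not values from the KNOW-CKD cohort.
    train, valid = split_70_30(range(10))
    print(len(train), len(valid))
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a sketch; a library routine based on sorting would be the usual choice for a cohort-sized dataset.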