Yin Yu, Chen Congcong, Zhang Dong, Han Qianguang, Wang Zijie, Huang Zhengkai, Chen Hao, Sun Li, Fei Shuang, Tao Jun, Han Zhijian, Tan Ruoyun, Gu Min, Ju Xiaobing
Department of Urology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China.
Department of Urology, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China.
Front Genet. 2023 Nov 1;14:1276963. doi: 10.3389/fgene.2023.1276963. eCollection 2023.
Interstitial fibrosis and tubular atrophy (IFTA) are the histopathological manifestations of chronic kidney disease (CKD) and one of the causes of long-term loss of transplanted kidneys. Necroptosis, a form of programmed cell death, plays an important role in the development of IFTA and in the late functional decline and eventual loss of grafts. In this study, 13 machine learning algorithms were used to construct IFTA diagnostic models based on necroptosis-related genes. We screened all 162 "kidney transplant"-related cohorts in the GEO database and obtained five data sets (training sets: GSE98320 and GSE76882; validation sets: GSE22459 and GSE53605; survival set: GSE21374). The training set was constructed by merging GSE98320 and GSE76882 after removing batch effects with the SVA package. Differentially expressed gene (DEG) analysis was used to identify necroptosis-related DEGs. A total of 13 machine learning algorithms (LASSO, Ridge, Enet, Stepglm, SVM, glmboost, LDA, plsRglm, random forest, GBM, XGBoost, Naive Bayes, and ANNs) were used to construct 114 IFTA diagnostic models, and the optimal model was selected by AUC. Post-transplantation patients were then grouped using consensus clustering, and the resulting subgroups were further explored using PCA, Kaplan-Meier (KM) survival analysis, functional enrichment analysis, CIBERSORT, and single-sample Gene Set Enrichment Analysis. A total of 55 necroptosis-related DEGs were identified by intersecting the DEGs with necroptosis-related gene sets. Stepglm[both]+RF was the optimal model, with an average AUC of 0.822. Clustering yielded four molecular subgroups of renal transplantation patients; the C4 group, which had a poor prognosis, showed significant upregulation of fibrosis-related pathways along with upregulation of immune response-related pathways. Based on combinations of the 13 machine learning algorithms, we developed 114 IFTA classification models. Furthermore, we tested the top model using two independent data sets from GEO.
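As an illustration of the model-selection step described above (ranking candidate models by their average AUC across cohorts), the following is a minimal Python sketch. It is not the authors' pipeline: the function names `auc` and `rank_models`, the cohort and model names, and all scores are hypothetical toy values, and the abstract does not state which cohorts were averaged.

```python
from statistics import mean


def auc(labels, scores):
    """Rank-based AUC: probability that a positive sample scores higher
    than a negative one, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def rank_models(predictions, labels_by_cohort):
    """predictions maps model -> cohort -> list of scores.
    Returns (model, mean AUC) pairs sorted from best to worst."""
    ranked = []
    for model, by_cohort in predictions.items():
        aucs = [auc(labels_by_cohort[c], s) for c, s in by_cohort.items()]
        ranked.append((model, mean(aucs)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: 1 = IFTA, 0 = no IFTA.
labels_by_cohort = {
    "cohort_A": [1, 1, 0, 0],
    "cohort_B": [1, 0, 1, 0],
}
predictions = {
    "model_1": {
        "cohort_A": [0.9, 0.8, 0.3, 0.1],
        "cohort_B": [0.7, 0.2, 0.6, 0.4],
    },
    "model_2": {
        "cohort_A": [0.6, 0.4, 0.5, 0.2],
        "cohort_B": [0.3, 0.5, 0.8, 0.1],
    },
}

ranking = rank_models(predictions, labels_by_cohort)
best_model, best_mean_auc = ranking[0]
print(best_model, best_mean_auc)
```

Averaging over cohorts rather than picking the best single-cohort AUC favors models that generalize across data sets, which is the apparent intent of reporting an average AUC.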