Lai Yongxing, Lin Xueyan, Lin Chunjin, Lin Xing, Chen Zhihan, Zhang Li
Department of Geriatric Medicine, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital, Fuzhou, China.
Fujian Provincial Center for Geriatrics, Fujian Provincial Hospital, Fuzhou, China.
Front Pharmacol. 2022 Aug 19;13:975774. doi: 10.3389/fphar.2022.975774. eCollection 2022.
Alzheimer's disease (AD) is a severe dementia with clinical and pathological heterogeneity. This study aimed to explore the roles of endoplasmic reticulum (ER) stress-related genes in AD patients using interpretable machine learning. Microarray datasets were obtained from the Gene Expression Omnibus (GEO) database. We applied nine machine learning algorithms, namely AdaBoost, Logistic Regression, Light Gradient Boosting (LightGBM), Decision Tree (DT), eXtreme Gradient Boosting (XGBoost), Random Forest, K-nearest neighbors (KNN), Naïve Bayes, and support vector machines (SVM), to screen ER stress-related feature genes and to estimate the efficiency of these genes for early diagnosis of AD. ROC curves were plotted to evaluate model performance. Shapley additive explanation (SHAP) was applied to interpret the results of these models. AD patients were classified using a consensus clustering algorithm. Immune infiltration and functional enrichment analyses were performed via CIBERSORT and GSVA, respectively. CMap analysis was used to identify subtype-specific small-molecule compounds. Higher levels of immune infiltration were found in individuals with AD and were markedly linked to dysregulated ER stress-related genes. The SVM model exhibited the highest AUC (0.879), accuracy (0.808), recall (0.773), and precision (0.809). Six characteristic genes (RNF5, UBAC2, DNAJC10, RNF103, DDX3X, and NGLY1) were identified, which enabled accurate prediction of AD progression. The SHAP plots illustrated how each feature gene influences the output of the SVM prediction model. A nomogram based on the feature genes could provide clinical benefit to patients with AD. Two ER stress-related subtypes were defined in AD; subtype 2 exhibited elevated immune infiltration levels and immune score, as well as higher expression of immune checkpoints. Finally, we identified several subtype-specific small-molecule compounds.
Our study provides new insights into the role of ER stress in AD heterogeneity and the development of novel targets for individualized treatment in patients with AD.
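As a reading aid for the four performance figures quoted above (AUC, accuracy, recall, precision), the following minimal Python sketch shows how such metrics can be computed from a classifier's predicted scores. It is an illustration only, not the authors' pipeline: the function names `roc_auc` and `classification_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions introduced here and are not taken from the study.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute AUC plus accuracy, recall, and precision at a fixed threshold.

    The 0.5 threshold is an assumption for this sketch.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return {
        "auc": roc_auc(y_true, scores),
        "accuracy": (tp + tn) / len(y_true),
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Toy data (hypothetical, not from the study): 1 = AD, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
metrics = classification_metrics(y_true, scores)
print(metrics)
```

AUC is threshold-free and summarizes how well the scores rank cases above controls, whereas accuracy, recall, and precision depend on the chosen cutoff, which is why the four values can differ for the same model.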