Guo Wenbo, Gou Xun, Yu Lei, Zhang Qi, Yang Ping, Pang Minghui, Pang Xinping, Pang Chaoyang, Wei Yanyu, Zhang XiaoYu
College of Computer Science, Sichuan Normal University, Chengdu, China.
College of Life Science, Sichuan Normal University, Chengdu, China.
Front Neurol. 2023 Mar 28;14:1129470. doi: 10.3389/fneur.2023.1129470. eCollection 2023.
Alzheimer's disease (AD) is a neurodegenerative disease that primarily affects elderly individuals and is characterized by cognitive impairment. Although extracellular β-amyloid (Aβ) accumulation and tau protein hyperphosphorylation are considered to be leading causes of AD, the molecular mechanism of AD remains unknown. Therefore, in this study, we aimed to explore potential biomarkers of AD. Two next-generation sequencing (NGS) datasets, GSE173955 and GSE203206, were collected from the Gene Expression Omnibus (GEO) database. Differentially expressed gene (DEG) analysis, gene ontology (GO) functional enrichment, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment, and protein-protein interaction (PPI) network analysis were performed to identify genes that are potentially associated with AD. Analysis of the DEG-based PPI network using Cytoscape indicated that three neuroinflammation and T-cell antigen receptor (TCR)-associated genes (LCK, ZAP70, and CD44) were the top three hub genes. Next, we validated these three hub genes in the AD database and applied two machine learning models to a different AD dataset (GSE15222) to observe their general relationship with AD. With the random forest classifier, the accuracy obtained using only the top three genes as inputs (78%) was six percentage points lower than that obtained using all genes as inputs (84%). Furthermore, analysis of another dataset, GSE97760, using our novel eigenvalue decomposition method indicated that the top three hub genes may be involved in tauopathies associated with AD, rather than in Aβ pathology. In addition, protein-protein docking simulation indicated that the proteins encoded by the top hub genes could form stable binding sites with acetylcholinesterase (ACHE). This suggests a potential interaction between the hub genes and ACHE, which plays an essential role in anti-AD drug design.
Overall, the findings of this study, which systematically analyzed several AD datasets, suggest that LCK, ZAP70, and CD44 may be used as AD biomarkers. We also established a robust prediction model for classifying patients with AD.
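As an illustration of the hub-gene step, the following minimal sketch ranks the nodes of an undirected interaction network by degree (number of distinct neighbors) and returns the top three. It rests on assumptions: the abstract does not state which ranking criterion was applied in Cytoscape, so degree is used here only as one common choice, and the function name `top_hub_nodes`, the placeholder node names, and the toy edge list are all hypothetical and do not represent real interactions.

```python
from collections import defaultdict


def top_hub_nodes(edges, k=3):
    """Rank nodes of an undirected network by degree and return the top k.

    Degree is an assumed ranking criterion; the source does not specify one.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Sort by descending degree; break ties alphabetically so output is deterministic.
    ranked = sorted(neighbors, key=lambda n: (-len(neighbors[n]), n))
    return [(n, len(neighbors[n])) for n in ranked[:k]]


# Hypothetical toy edge list with placeholder node names (not real interactions).
toy_edges = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G1", "G5"),
    ("G2", "G3"), ("G2", "G4"), ("G3", "G5"), ("G5", "G6"),
]

hubs = top_hub_nodes(toy_edges)
print(hubs)  # [('G1', 4), ('G2', 3), ('G3', 3)]
```

In practice the edge list would come from the DEG-based PPI network, and other centrality measures could give a different ranking than plain degree.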