Yang Longxiu, Qin Yuan, Jian Chongdong
Department of Neurology, The First Affiliated Hospital of Guangxi Medical University, Nanning, China.
Department of Neurology, The Affiliated Hospital of Youjiang Medical University for Nationalities, Baise, China.
Front Cell Dev Biol. 2021 Apr 22;9:668738. doi: 10.3389/fcell.2021.668738. eCollection 2021.
Alzheimer's disease (AD), a disease of the nervous system, currently lacks effective therapies. RNA expression is a fundamental means of regulating life activities, and identifying expression characteristics specific to AD patients may aid the exploration of AD pathogenesis and treatment. This study developed a classifier that accurately distinguishes AD patients from healthy people, and then obtained 3 core genes that may be related to the pathogenesis of AD. To this end, RNA expression data from the middle temporal gyrus of AD patients were first downloaded from the GEO database; missing values were imputed with the k-Nearest Neighbor (KNN) algorithm, and the data were then normalized using a software package. Next, the 500 genes with the highest feature importance were obtained through Max-Relevance and Min-Redundancy (mRMR) analysis, and a series of AD classifiers were built on these genes using the Support Vector Machine (SVM), Random Forest (RF), and KNN algorithms. In incremental feature selection (IFS) analysis, the KNN classifier composed of 14 genes achieved the highest Matthews correlation coefficient (MCC) and was identified as the best AD classifier. The analysis indicated that these 14 genes play a pivotal role in determining AD status and may be core genes associated with AD pathogenesis. Finally, a protein-protein interaction (PPI) network and Random Walk with Restart (RWR) analysis were used to obtain genes associated with the core genes, and key pathways related to AD were further analyzed. Overall, this study contributes to a deeper understanding of AD pathogenesis and provides theoretical guidance for related research and experiments.
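The incremental feature selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`mcc`, `knn_loo_predict`, `incremental_feature_selection`), the leave-one-out evaluation, the choice of k = 1, and the toy expression matrix are all assumptions made for illustration, and the ranked feature list simply stands in for the output of an mRMR ranking. The sketch evaluates KNN classifiers on the top-1, top-2, ... ranked features and keeps the subset with the highest MCC.

```python
from math import sqrt

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

def knn_loo_predict(X, y, features, k=1):
    """Leave-one-out KNN predictions using only the given feature columns."""
    preds = []
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample, paired with its label.
        dists = sorted(
            (sum((xi[f] - xj[f]) ** 2 for f in features), y[j])
            for j, xj in enumerate(X) if j != i
        )
        votes = [label for _, label in dists[:k]]
        preds.append(1 if sum(votes) * 2 > len(votes) else 0)
    return preds

def incremental_feature_selection(X, y, ranked_features, k=1):
    """Score the top-1, top-2, ... feature subsets; return the best by MCC."""
    best_n, best_score = 0, -1.0
    for n in range(1, len(ranked_features) + 1):
        score = mcc(y, knn_loo_predict(X, y, ranked_features[:n], k))
        if score > best_score:  # strict '>' keeps the smallest best subset
            best_n, best_score = n, score
    return ranked_features[:best_n], best_score

# Toy data (hypothetical, not from the study): rows are samples, columns are genes.
X = [
    [0.1, 5.0, 1.0],
    [0.2, 1.0, 9.0],
    [0.3, 7.0, 3.0],
    [0.9, 2.0, 8.0],
    [1.0, 6.0, 2.0],
    [1.1, 0.5, 7.0],
]
y = [0, 0, 0, 1, 1, 1]          # 0 = control, 1 = AD (illustrative labels)
ranked = [0, 1, 2]               # stand-in for an mRMR ranking of gene indices

selected, score = incremental_feature_selection(X, y, ranked)
print(selected, score)
```

In this toy case the first ranked gene alone separates the two groups, so the sketch selects a one-gene subset. In the study the same kind of search over the 500 mRMR-ranked genes yielded a 14-gene KNN classifier.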