Mu Kai-Lang, Ran Fei, Peng Le-Qiang, Zhou Ling-Li, Wu Yu-Tong, Shao Ming-Hui, Chen Xiang-Gui, Guo Chang-Mao, Luo Qiu-Mei, Wang Tian-Jian, Liu Yu-Chen, Liu Gang
Guizhou University of Traditional Chinese Medicine, Guiyang, 550025, Guizhou, China.
Heliyon. 2024 Aug 5;10(15):e35511. doi: 10.1016/j.heliyon.2024.e35511. eCollection 2024 Aug 15.
Rheumatoid arthritis (RA) is a chronic systemic autoimmune disease characterized by inflammatory cell infiltration, which can lead to chronic disability, joint destruction and loss of function. The pathogenesis of RA remains unclear. This study aimed to explore potential biomarkers and immune-related molecular mechanisms of RA through machine learning-assisted bioinformatics analysis, in order to provide a reference for the early diagnosis and treatment of RA.
RA gene expression microarray datasets were retrieved from the public Gene Expression Omnibus (GEO) database, and batch correction across the different RA datasets was performed using Strawberry Perl. Differentially expressed genes (DEGs) were identified using the limma package in R, and functional enrichment analyses were performed, including gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), disease ontology (DO), and gene set enrichment analysis (GSEA). Three machine learning methods, least absolute shrinkage and selection operator (LASSO) regression, support vector machine recursive feature elimination (SVM-RFE) and random forest, were used to identify potential biomarkers of RA. A validation dataset was then used to confirm the expression and diagnostic value of the candidate genes. In addition, the CIBERSORT algorithm was used to evaluate immune cell infiltration in RA and control samples, and the correlation between the confirmed RA diagnostic biomarkers and immune cells was analyzed.
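The step of combining the three feature-selection results and then scoring a candidate gene on a validation set can be illustrated with a minimal Python sketch. It assumes that "combining" means keeping only the genes selected by all three methods, and that diagnostic value is summarized by the area under the ROC curve; neither detail is stated in the abstract. The gene names, expression values and helper functions (`consensus_genes`, `roc_auc`) are hypothetical placeholders, not data or code from the study.

```python
from typing import Iterable, List, Sequence, Set


def consensus_genes(selections: Iterable[Set[str]]) -> List[str]:
    """Return the genes chosen by every feature-selection method, sorted."""
    sets = list(selections)
    if not sets:
        return []
    return sorted(set.intersection(*sets))


def roc_auc(case_values: Sequence[float], control_values: Sequence[float]) -> float:
    """AUC via the Mann-Whitney statistic: P(case > control) + 0.5 * P(tie)."""
    if not case_values or not control_values:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical gene sets returned by LASSO, SVM-RFE and random forest.
lasso = {"GENE_A", "GENE_B", "GENE_C"}
svm_rfe = {"GENE_B", "GENE_C", "GENE_D"}
random_forest = {"GENE_C", "GENE_B", "GENE_E"}
hub_genes = consensus_genes([lasso, svm_rfe, random_forest])

# Hypothetical validation-set expression of one hub gene (RA vs. control).
ra_expr = [7.1, 6.8, 7.4, 6.9]
control_expr = [5.9, 6.2, 6.8, 6.0]
auc = roc_auc(ra_expr, control_expr)
```

An AUC close to 1 would indicate that the gene separates RA from control samples well in this toy data; an AUC near 0.5 would indicate no discriminative value.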
Feature screening yielded 79 key DEGs, mainly involved in virus response, the Parkinson's disease pathway, dermatitis and cell junction components. LASSO regression selected 29 hub genes, SVM-RFE selected 34, and random forest selected 39. Combining the three algorithms yielded a total of 12 hub genes. After verification of expression and diagnostic value in the validation dataset, 7 genes were preliminarily confirmed as candidate diagnostic biomarkers for RA. Correlation analysis of immune cells further showed that γδ T cells, activated CD4 memory T cells, activated dendritic cells and other immune cells were positively correlated with multiple RA diagnostic biomarkers, whereas naive CD4 T cells, regulatory T cells and other immune cells were negatively correlated with them.
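The biomarker-to-immune-cell correlation can be sketched in the same spirit. The abstract does not say which correlation coefficient was used, so the example below assumes Spearman rank correlation, a common choice for this kind of analysis. The expression values and cell fractions are invented for illustration, and `average_ranks` and `spearman` are hypothetical helper names.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical per-sample values: one biomarker and two immune cell fractions
# of the kind CIBERSORT reports.
gene_expr = [5.2, 6.1, 6.8, 7.4, 7.9]
gamma_delta_t = [0.01, 0.02, 0.04, 0.05, 0.09]
regulatory_t = [0.08, 0.07, 0.05, 0.06, 0.02]

rho_gamma_delta = spearman(gene_expr, gamma_delta_t)  # positive association
rho_treg = spearman(gene_expr, regulatory_t)          # negative association
```

In a real analysis each coefficient would be accompanied by a significance test and a correction for multiple comparisons, which this sketch omits.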
Analysis of the novel characteristic genes of RA showed that seven genes had good diagnostic and clinical value for RA and were closely related to immune cells. These seven DEGs may therefore become new diagnostic and immunotherapy markers for RA.