Teng Jieying, Deng Guoxiong
Department of Cardiology, The Fifth Affiliated Hospital of Guangxi Medical University, Nanning, China.
Department of Cardiology, The First People's Hospital of Nanning, Nanning, China.
Front Cardiovasc Med. 2025 Feb 26;12:1521722. doi: 10.3389/fcvm.2025.1521722. eCollection 2025.
This study used bioinformatics analysis to identify the co-expressed differentially expressed genes (DEGs) shared by atrial fibrillation (AF) and chronic kidney disease (CKD), to find biomarkers for the occurrence and development of the two diseases, to investigate potential links between AF and CKD, and to explore their associations with immune cells.
We downloaded two AF gene chip datasets (GSE79768, GSE14975) and two CKD gene chip datasets (GSE37171, GSE120683) from the GEO database. After the datasets were pre-processed and standardized, two DEG datasets were obtained. DEGs were screened in R, and their intersection was taken with a Venn diagram to obtain the co-expressed DEGs of AF and CKD. GO/KEGG enrichment analyses were used to identify the signaling pathways in which the co-expressed DEGs were significantly enriched. A protein-protein interaction (PPI) network was constructed in Cytoscape, and the top 15 co-expressed DEGs were selected with the MCC topological algorithm. To further screen key characteristic genes, two machine-learning algorithms, LASSO regression and random forest (RF), were applied to each of the two disease datasets, and the diagnostic value of the resulting characteristic genes in the two diseases was determined. The GeneMANIA online database and the NetworkAnalyst platform were used to construct gene-gene and transcription factor (TF)-gene interaction networks, respectively, to predict gene functions and identify key transcription factors. Finally, the correlation between key genes and immune cell subtypes was assessed by Spearman analysis.
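The two Venn-diagram steps described above (intersecting the AF and CKD DEG lists, then intersecting the LASSO and RF selections) amount to set intersections. The following is a minimal sketch of that idea only, not the authors' R workflow; the function name `shared_genes` and the `GENE_*` identifiers are hypothetical placeholders, not genes or results from the study.

```python
def shared_genes(*gene_sets):
    """Return the sorted intersection of any number of gene collections."""
    if not gene_sets:
        return []
    common = set(gene_sets[0])
    for genes in gene_sets[1:]:
        common &= set(genes)
    return sorted(common)


# Placeholder identifiers for illustration only (not data from the study).
af_degs = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
ckd_degs = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}

# Step 1: DEGs present in both disease lists ("co-expressed DEGs").
co_expressed = shared_genes(af_degs, ckd_degs)

# Step 2: keep only genes that both feature-selection methods retained.
lasso_selected = {"GENE_B", "GENE_C"}  # assumed output of a LASSO fit
rf_selected = {"GENE_C", "GENE_D"}     # assumed output of an RF ranking
key_genes = shared_genes(co_expressed, lasso_selected, rf_selected)

print(co_expressed)
print(key_genes)
```

Sorting the result keeps the output order stable between runs, which is convenient when the gene lists are compared across datasets.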
A total of 425 DEGs were identified in the AF dataset and 4,128 DEGs in the CKD dataset; their intersection yielded 82 co-expressed DEGs. GO enrichment analysis showed that these genes were mainly enriched in terms such as secretory granule lumen, blood microparticle, complement binding, and antigen binding. KEGG enrichment analysis indicated that the genes were mainly enriched in pathways such as the complement and coagulation cascades, systemic lupus erythematosus, and Staphylococcus aureus infection. The top 15 DEGs were obtained with the MCC topological algorithm in Cytoscape. The 15 co-expressed DEGs were then screened further with LASSO regression and the RF algorithm, and the intersection of the results, taken with Venn diagrams, gave five key characteristic genes: PPBP, CXCL1, LRRK2, RGS18, and RSAD2. ROC curves were constructed and the area under the curve (AUC) was calculated to verify the diagnostic efficacy of the key characteristic genes. RSAD2 had the highest diagnostic value for AF, and PPBP, CXCL1, and RSAD2 all showed relatively strong diagnostic value for CKD. Applying a threshold of AUC >0.7 gave three co-expressed key genes with strong diagnostic efficacy: PPBP, CXCL1, and RSAD2. GeneMANIA analysis showed that two of these biomarkers, PPBP and CXCL1, mainly had functional interactions related to cytokine activity, chemokine receptor activity, cellular response to chemokines, neutrophil migration, response to chemokines, granulocyte chemotaxis, and granulocyte migration. The TF-gene regulatory network identified FOXC1, FOXL1, and GATA2 as the main transcription factors of the key characteristic genes.
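The AUC used as the diagnostic criterion above can be read as the probability that a randomly chosen case has a higher value than a randomly chosen control, with ties counted as one half. The sketch below computes it that way and applies the AUC >0.7 cutoff; it is an illustration under assumed inputs, not the authors' code. The function names and the expression values are invented for the example.

```python
def roc_auc(case_values, control_values):
    """AUC as the fraction of case/control pairs in which the case is higher.

    Ties contribute 0.5. This pairwise count equals the area under the
    empirical ROC curve.
    """
    if not case_values or not control_values:
        raise ValueError("both groups need at least one value")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def passes_cutoff(auc, cutoff=0.7):
    """Apply the strict AUC > cutoff rule described in the text."""
    return auc > cutoff


# Invented expression values for one hypothetical gene (not study data).
cases = [5.1, 6.0, 4.8, 6.3]
controls = [4.0, 4.9, 3.7, 5.0]

auc = roc_auc(cases, controls)
print(auc, passes_cutoff(auc))
```

For a gene that is lower in cases than in controls, this function returns a value below 0.5; in that situation the direction of the comparison would need to be reversed before applying the cutoff.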
Finally, immune infiltration analysis indicated that multiple immune cell types infiltrate during the development of AF and CKD.
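The Spearman analysis used to relate key genes to immune cell subtypes is a Pearson correlation computed on ranks. A minimal standard-library sketch is given below, assuming one gene's expression values and one immune cell type's estimated fractions across the same samples; the function names and the numbers are hypothetical and do not come from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Invented per-sample values (not study data).
gene_expression = [2.1, 3.4, 1.8, 4.0, 2.9]
immune_fraction = [0.10, 0.22, 0.08, 0.30, 0.15]

rho = spearman_rho(gene_expression, immune_fraction)
print(rho)
```

Because only ranks enter the calculation, the result is unaffected by monotonic transformations such as log-scaling the expression values.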
PPBP, CXCL1, and RSAD2 are key genes closely related to the occurrence and development of both AF and CKD. Among the pathways involved, the CXCLs/CXCR signaling pathway likely plays a crucial role in the development of the two diseases. In addition, FOXC1, FOXL1, and GATA2 may be potential therapeutic targets for AF combined with CKD, and the development of the diseases is closely related to immune cell infiltration.