Wang MaoMeng, Wang Shuang, Lin XinHua, Lv XiaoJing, Liu XueXia, Zhang Hua
Qingdao University Affiliated Yantai Yuhuangding Hospital, Yantai, Shandong Province, China.
Department of Otorhinolaryngology, Head and Neck Surgery, Yantai Yuhuangding Hospital, Qingdao University, Yantai, Shandong, China.
PLoS One. 2025 Sep 3;20(9):e0329549. doi: 10.1371/journal.pone.0329549. eCollection 2025.
This study was designed to identify immune-related biomarkers associated with allergic rhinitis (AR) and to construct a robust diagnostic model. Two datasets (GSE5010 and GSE50223) were downloaded from the NCBI GEO database, containing 38 and 84 blood CD4+ T cell samples, respectively. To eliminate batch effects, the surrogate variable analysis (sva) R package (version 3.38.0) was employed, enabling the integration of the data for subsequent analysis. Immune cell infiltration profiles were assessed using the Gene Set Variation Analysis (GSVA) R package (version 1.36.3). A gene co-expression network was constructed via the Weighted Gene Co-Expression Network Analysis (WGCNA) algorithm to identify disease-related modules. Differentially expressed genes (DEGs) were identified using the linear models for microarray data (limma) R package (version 3.34.7), followed by functional enrichment analysis using DAVID. Protein-protein interaction (PPI) networks were constructed based on the STRING database to highlight key genes. A diagnostic model was subsequently developed using the Least Absolute Shrinkage and Selection Operator (LASSO) regression algorithm and the Support Vector Machine (SVM) method, with its discriminative capacity assessed via Receiver Operating Characteristic (ROC) curves. A total of twenty-eight immune cell types were analyzed, revealing significant differences in eight types between the AR and control groups. Through WGCNA, three disease-related modules comprising 4278 candidate genes were identified. Differential expression analysis identified 326 significant DEGs, of which 257 overlapped with WGCNA-selected genes. These genes exhibited significant enrichment in immune-related pathways, including "cytokine-cytokine receptor interaction" and "chemokine signaling pathway." Gene Set Enrichment Analysis (GSEA) further uncovered 12 KEGG pathways significantly associated with disease risk scores.
Drug screening identified 24 small-molecule drugs related to key genes. A diagnostic model incorporating five genes (RFC4, LYN, IL3, TNFRSF1B, and RBBP7) was constructed, demonstrating diagnostic efficiencies of 0.843 and 0.739 in the training and validation sets, respectively. An AR mouse model was successfully established, and the expression levels of the relevant genes were validated by RT-qPCR. The five-gene diagnostic model established in this study exhibits strong predictive ability in distinguishing AR patients from healthy controls, with potential clinical applications in diagnosing AR and in advancing novel diagnostic and therapeutic strategies.
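
As an illustration of how discriminative capacity is commonly summarized from an ROC curve, the sketch below computes the area under the curve (AUC) as the fraction of (case, control) pairs in which the case receives the higher model score. It is a minimal sketch: the abstract does not state that the reported diagnostic efficiencies are AUC values, so that reading is an assumption, and the function name `roc_auc` and the toy labels and scores are hypothetical and not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for cases (e.g. AR), 0 for controls.
    scores: model outputs, higher meaning "more likely a case".
    Equivalent to the Mann-Whitney U statistic divided by the
    number of (case, control) pairs; tied scores count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): three cases, three controls.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls, which is the scale on which figures such as 0.843 and 0.739 would be interpreted under the assumption above.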