Hubei University of Chinese Medicine, Wuhan, Hubei Province, China.
Department of Gastroenterology, Wuhan No.1 Hospital, Hubei Province, China.
Autoimmunity. 2024 Dec;57(1):2422352. doi: 10.1080/08916934.2024.2422352. Epub 2024 Oct 30.
Crohn's disease (CD) presents significant diagnostic and therapeutic challenges due to its unclear etiology, frequent relapses, and limited treatment options. Traditional monitoring often relies on invasive and costly gastrointestinal procedures. This study aimed to identify specific diagnostic markers for CD using computational approaches. Four gene expression datasets from the Gene Expression Omnibus (GEO) were analyzed, identifying differentially expressed genes (DEGs) through gene set enrichment analysis in R. Key biomarkers were selected using machine learning algorithms, including least absolute shrinkage and selection operator (LASSO) logistic regression, support vector machine recursive feature elimination (SVM-RFE), and Random Forest, and their diagnostic accuracy was assessed using receiver operating characteristic (ROC) curves and nomogram models. Immune cell infiltration was analyzed using the CIBERSORT algorithm to examine associations between the diagnostic markers and immune cell patterns in CD. From a training set of 605 CD samples and 82 normal controls, we identified eight significant biomarkers: LCN2, FOLH1, CXCL1, FPR1, S100P, IGFBP5, CHP2, and AQP9. The diagnostic model showed high predictive power (AUC = 0.954) and performed well in external validation (AUC = 1). Immune cell infiltration analysis highlighted various immune cell types involved in CD, and all eight diagnostic markers were strongly associated with immune cell infiltration. These findings identify candidate hub genes and provide a nomogram for CD diagnosis, offering potential diagnostic biomarkers for clinical use.
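For readers less familiar with the AUC values quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from model scores. It is not the authors' code (the study reports using R); the function name `roc_auc` and the toy scores and labels are hypothetical and chosen only for illustration. The sketch uses the rank-based interpretation of AUC: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a case (e.g. CD), 0 for a control.
    Returns the fraction of (case, control) pairs in which the case
    has the higher score; tied pairs contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four cases followed by three controls.
toy_labels = [1, 1, 1, 1, 0, 0, 0]

# Every case outscores every control, so the AUC is exactly 1.
separated = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.1]
auc_separated = roc_auc(separated, toy_labels)

# One case (0.35) falls below one control (0.4), so the AUC drops below 1.
overlapping = [0.9, 0.8, 0.35, 0.6, 0.4, 0.3, 0.1]
auc_overlapping = roc_auc(overlapping, toy_labels)

print(auc_separated, auc_overlapping)
```

Under this reading, an AUC of 1 means that every case in the evaluated set scored above every control, and a value such as 0.954 means that roughly 95% of case-control pairs were ranked correctly.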