Xiao Jie, Liang Junyao, Zhou Tao, Zhou Man, Zhang Dexu, Feng Hui, Tang Chusen, Zhou Qian, Yang Weiqing, Tan Xiaoqin, Zhang Wanjia, Xu Yin
First Affiliated Hospital of Hunan University of Traditional Chinese Medicine, Changsha, 410007, Hunan, China.
Sci Rep. 2024 Dec 30;14(1):31736. doi: 10.1038/s41598-024-82319-5.
Crohn's disease (CD) is a chronic inflammatory bowel disease, and research indicates that it is closely associated with colon adenocarcinoma (COAD), one of the most prevalent malignant tumors of the digestive tract. This study uses bioinformatics techniques to uncover potential molecular links between CD and COAD. Two CD-related data series were selected from the Gene Expression Omnibus (GEO) database according to specific criteria, and relevant COAD gene data were obtained from The Cancer Genome Atlas (TCGA). Weighted Gene Co-expression Network Analysis (WGCNA), differentially expressed gene (DEG) analysis, and protein-protein interaction (PPI) network analysis were performed. A diagnostic model was built using machine learning, and its diagnostic accuracy was validated with Receiver Operating Characteristic (ROC) curves, nomograms, and related methods. Gene Set Enrichment Analysis (GSEA) was used to identify the relevant pathways and biological processes. Machine learning selection identified three genes: DPEP1, MMP3, and MMP13. The ROC curves showed that the model built on these three genes is highly accurate, supporting their potential as biomarkers. GSEA further indicated that the pathways associated with these three key genes are closely related to cytokines and other factors. In summary, the study identifies DPEP1, MMP3, and MMP13 as key biomarker genes for CD and COAD, adding to the known molecular associations between the two diseases and offering further connections and pathways for reference regarding the progression of CD to COAD.
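The abstract reports ROC-based validation of the three-gene model but gives no computational detail. As a minimal sketch of what an ROC AUC measures for a single candidate marker, the snippet below computes the area under the curve as the probability that a disease sample scores higher than a control sample. The function name `roc_auc` and the expression values are hypothetical and invented for illustration; they are not the study's code or data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one; ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up expression values for one hypothetical marker gene
# (1 = disease sample, 0 = control). Not data from the study.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_expression = [7.9, 6.8, 7.2, 5.1, 4.3, 5.6, 3.9, 4.8]

auc = roc_auc(labels, toy_expression)
print(f"AUC = {auc:.3f}")
```

An AUC near 1.0 means the marker separates the two groups almost perfectly, while 0.5 corresponds to chance. The pairwise loop is quadratic in sample count, which is acceptable for a small illustration; a rank-based implementation would be preferable for large cohorts.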