Shen Kexin, Zhang Shujuan, Ma Shurong, Zhang Haishan
Department of Gastrointestinal Colorectal and Anal Surgery, China-Japan Union Hospital of Jilin University, Changchun, China.
Endoscopy Center, China-Japan Union Hospital of Jilin University, Changchun, China.
J Comput Biol. 2019 Dec;26(12):1367-1378. doi: 10.1089/cmb.2019.0064. Epub 2019 Jul 1.
Biomarkers involved in the progression of Barrett's esophagus (BE) have not been extensively studied. We aimed to identify novel molecular markers for the early diagnosis of BE. Expression profiles from dataset GSE100843, comprising BE segment and normal squamous mucosa samples taken before and after vitamin D supplementation, were downloaded from the Gene Expression Omnibus. Differentially expressed genes (DEGs) were identified using the limma package. Principal component analysis was performed using Minitab, and DEGs in the top three principal components were clustered into gene sets using the mclust package. Pathways and functions enriched in these gene sets were evaluated by deregulation score analysis. Key genes associated with BE were identified by coexpression analysis and a genetic algorithm. An XGBoost classifier specific for BE was then constructed from the key genes using the xgboost package. A total of 2598 DEGs were identified and clustered into nine gene sets. According to the deregulation scores of the pathways and functions enriched in these gene sets, nine functional and pathway terms were significantly deregulated in BE. Among the DEGs, three genes had high fitness levels and connectivity degrees, suggesting that they are key genes associated with BE. The XGBoost classifier constructed from the key genes was efficient and robust in BE prediction, with prediction accuracies of 93% and 87% for the training and validation datasets, respectively. The key genes may serve as novel biomarkers of BE, and the XGBoost classifier may contribute to the diagnosis of BE in future clinical practice.
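The abstract ranks candidate genes partly by their connectivity degree in a coexpression analysis but does not say how that degree was computed. As a minimal sketch of one common approach, and not the authors' method, the Python below links two genes when the absolute Pearson correlation of their expression values meets a cutoff, then counts each gene's links. The gene names, the toy expression values, the 0.8 cutoff, and the function names `pearson` and `connectivity_degrees` are all illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def connectivity_degrees(expression, threshold=0.8):
    """Count, for each gene, how many other genes it is coexpressed with.

    Two genes are linked when |Pearson r| >= threshold. The cutoff is an
    assumed value, not one reported in the abstract.
    """
    degrees = {gene: 0 for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            degrees[g1] += 1
            degrees[g2] += 1
    return degrees


# Toy expression values across five samples; gene names are placeholders.
toy_expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with gene_a
    "gene_c": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as gene_a rises
    "gene_d": [3.0, 1.0, 4.0, 1.0, 3.0],  # no clear relation to the others
}

degrees = connectivity_degrees(toy_expression)
# Genes ordered from most to least connected.
ranked = sorted(degrees, key=degrees.get, reverse=True)
```

In this toy network `gene_a`, `gene_b`, and `gene_c` are mutually linked while `gene_d` is isolated, so a degree-based ranking would place `gene_d` last. A real analysis would start from the DEG expression matrix and would typically combine the degree with other criteria, such as the genetic-algorithm fitness mentioned in the abstract.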