School of Electronics and Information, Northwestern Polytechnical University, 1 Dongxiang Road, Xi'an, 710129, Shaanxi, China.
Xi'an Key Laboratory of Stem Cell and Regenerative Medicine, Institute of Medical Research, Northwestern Polytechnical University, 127 West Youyi Road, Xi'an, 710072, Shaanxi, China.
Sci Rep. 2022 May 24;12(1):8761. doi: 10.1038/s41598-022-12780-7.
Combining the TCGA and GTEx databases provides more comprehensive information for characterizing the human genome in health and disease, especially the genetic alterations underlying cancer. Here we analyzed gene expression profiles of colon adenocarcinoma (COAD) using tumor samples from TCGA and normal colon tissues from GTEx. Using the SNR-PPFS feature selection algorithms, we identified a 38-gene signature that performed well in distinguishing COAD tumors from normal samples. A Bayesian network of the 38 genes revealed that differentially expressed genes (DEGs) with similar expression patterns or functions interacted more closely. We identified 14 upregulated DEGs that were significantly correlated with tumor stage. Cox regression analysis demonstrated that tumor stage and dysregulation of STMN4 and FAM135B were independent prognostic factors for survival outcomes in COAD. Overall, this study indicates that using feature selection approaches to select key gene signatures from high-dimensional datasets can be an effective way to study cancer genomic characteristics.
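As a rough illustration of the kind of filter step the abstract refers to, the sketch below ranks genes by a signal-to-noise ratio (SNR) between tumor and normal samples. It assumes the classic per-gene form, (mean_tumor - mean_normal) / (std_tumor + std_normal); the abstract does not give the exact formula, and the PPFS stage is not reproduced here. The function names `snr_scores` and `top_genes` and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np


def snr_scores(tumor, normal, eps=1e-12):
    """Per-gene SNR, assumed here to be (mean_t - mean_n) / (std_t + std_n).

    tumor, normal: array-likes of shape (n_samples, n_genes).
    """
    tumor = np.asarray(tumor, dtype=float)
    normal = np.asarray(normal, dtype=float)
    diff = tumor.mean(axis=0) - normal.mean(axis=0)
    # Sample standard deviation (ddof=1) of each gene within each group.
    spread = tumor.std(axis=0, ddof=1) + normal.std(axis=0, ddof=1)
    # eps guards against division by zero for constant genes.
    return diff / (spread + eps)


def top_genes(tumor, normal, k):
    """Return indices of the k genes with the largest |SNR|, plus all scores."""
    scores = snr_scores(tumor, normal)
    order = np.argsort(-np.abs(scores), kind="stable")
    return order[:k], scores


# Synthetic toy data (NOT TCGA or GTEx values): 20 genes, two of them shifted.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(50, 20))
tumor = rng.normal(0.0, 1.0, size=(60, 20))
tumor[:, 3] += 5.0   # gene 3 simulated as upregulated in tumors
tumor[:, 7] -= 5.0   # gene 7 simulated as downregulated in tumors

selected, scores = top_genes(tumor, normal, k=2)
print(sorted(selected.tolist()))
```

A positive score marks a gene that is higher in tumors and a negative score one that is lower, so ranking by absolute value keeps both directions. In a real analysis the two matrices would first need to be normalized onto a comparable scale, since TCGA and GTEx samples come from different sources.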