Department of Biomedical Engineering, Tianjin Key Lab of BME Measurement, Tianjin University, Tianjin, PR China.
Department of Radiation Oncology, Tianjin Medical University Cancer Institute & Hospital, Tianjin, PR China.
Cancer Gene Ther. 2020 Sep;27(9):715-725. doi: 10.1038/s41417-019-0143-5. Epub 2019 Oct 23.
Triple-negative breast cancer (TNBC), colon adenocarcinoma (COAD), ovarian cancer (OV), and glioblastoma multiforme (GBM) are common malignant tumors for which early diagnosis, treatment, and prognosis remain significant challenges. Identifying further genes related to these tumors is therefore of great significance for improving disease management. In this study, gene expression profiles were obtained from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) repository. Tumor-related genes were selected using the maximal relevance and minimal redundancy (mRMR) feature selection algorithm together with the protein-protein interaction (PPI) network, and 20 genes were finally selected as potentially related genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed on these genes, and the tumor-specific genes, as well as the similarities and differences between network modules and pathways, were analyzed. Using these potential cancer-related genes as features, a support vector machine (SVM) model was then developed to predict high-risk malignant tumors. The prediction accuracy exceeded 85%, indicating that the model can effectively predict the four types of malignant tumors. These results indicate that the genes identified in this study indeed play important roles in the differentiation of the four types of malignant tumors, provide a basis for future experimental biological validation, and shed some light on new molecular mechanisms related to these tumors.
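The mRMR selection step named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the mRMR variant, the discretization, or the data layout used in the study, so everything below is an assumption: the gene names and expression values are invented toy data, expression is assumed to be already discretized (0 = low, 1 = high), and the score uses the common "difference" form (relevance minus mean redundancy). The PPI network and SVM steps are not shown.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c * n) / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedy mRMR, difference form (an assumed variant, not necessarily the study's).

    At each step, pick the feature that maximizes its relevance I(feature; labels)
    minus its mean redundancy I(feature; s) over the already selected features s.
    """
    relevance = {name: mutual_information(vals, labels) for name, vals in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 samples, two per tumor type.
labels = ["TNBC", "TNBC", "COAD", "COAD", "OV", "OV", "GBM", "GBM"]
features = {
    "gene_a": [0, 0, 0, 0, 1, 1, 1, 1],       # informative
    "gene_a_copy": [0, 0, 0, 0, 1, 1, 1, 1],  # informative but redundant with gene_a
    "gene_b": [0, 0, 1, 1, 0, 0, 1, 1],       # informative, complementary to gene_a
    "gene_noise": [0, 1, 0, 1, 0, 1, 0, 1],   # unrelated to the tumor type
}

selected = mrmr_select(features, labels, k=2)
print(selected)
```

On this toy input the redundant copy is passed over in favor of the complementary gene, which is the behavior that distinguishes mRMR from ranking features by relevance alone.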