Xia Wen-Tao, Qiu Wang-Ren, Yu Wang-Ke, Xu Zhao-Chun, Zhang Shou-Hua
School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, China.
Department of General Surgery, Jiangxi Provincial Children's Hospital, Nanchang, China.
Heliyon. 2023 Apr 7;9(4):e15096. doi: 10.1016/j.heliyon.2023.e15096. eCollection 2023 Apr.
The mortality rate of cervical cancer (CESC), a malignant tumor affecting women, has increased significantly worldwide in recent years. With advances in bioinformatics, biomarker discovery offers a promising direction for the diagnosis of cervical cancer. However, because omics data are high-dimensional with small sample sizes, and because biomarkers are often derived from a single type of omics data, diagnosis based on them may be inaccurate and unreliable. The goal of this study was to identify potential biomarkers for the diagnosis and prognosis of CESC using the GEO and TCGA databases. We first downloaded CESC DNA methylation data (GSE30760) from GEO, performed differential analysis on them, and screened out the differential genes. Next, using estimation algorithms, we scored immune cells and stromal cells in the tumor microenvironment and performed survival analysis on the gene expression profiles and the most recent clinical data of CESC from TCGA. We then used the 'limma' package and Venn plots in R to perform differential gene analysis and screen out overlapping genes, which were subjected to GO and KEGG functional enrichment analysis. The differential genes screened from the GEO methylation data were intersected with those screened from the TCGA gene expression data to obtain the common differential genes. A protein-protein interaction (PPI) network was then constructed from the gene expression data to identify key genes, and these key genes were intersected with the previously identified common differential genes for further validation. Finally, Kaplan-Meier curves were used to assess the prognostic significance of the key genes.
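The two intersection steps described above (methylation-derived versus expression-derived differential genes, then the result versus the PPI key genes) amount to set intersections over gene symbols. The following is a minimal Python sketch of that idea only; the study itself reports using R, and the function name `intersect_gene_sets` and all gene identifiers below are hypothetical placeholders, not genes or code from the paper.

```python
def intersect_gene_sets(*gene_lists):
    """Return the sorted gene symbols present in every input list.

    Symbols are compared case-insensitively and surrounding whitespace
    is ignored. This normalization is an assumption of the sketch.
    """
    if not gene_lists:
        return []
    normalized = [
        {gene.strip().upper() for gene in genes if gene.strip()}
        for genes in gene_lists
    ]
    return sorted(set.intersection(*normalized))


# Placeholder identifiers for illustration; NOT results from the study.
methylation_degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
expression_degs = ["gene_b", "GENE_C", "GENE_D", "GENE_E"]
ppi_key_genes = ["GENE_C", "GENE_D", "GENE_F"]

# Step 1: genes flagged by both the methylation and expression analyses.
common_degs = intersect_gene_sets(methylation_degs, expression_degs)

# Step 2: keep only common genes that are also key genes in the PPI network.
validated_genes = intersect_gene_sets(common_degs, ppi_key_genes)

print(common_degs)
print(validated_genes)
```

A Venn plot conveys the same overlap visually; the set operation is what yields the actual list of shared genes.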
Survival analysis showed that CD3E and CD80 are important for identifying cervical cancer and can be considered potential biomarkers for the disease.
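For readers unfamiliar with the Kaplan-Meier method used in the survival analysis, the estimator multiplies, at each observed event time, the fraction of at-risk patients who survive that time. The sketch below is a plain-Python illustration of the standard product-limit estimator, not the authors' R workflow; the function name `kaplan_meier` and the toy follow-up data are assumptions made for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up duration for each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival_probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # everyone leaving the risk set at each time

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Fraction of the at-risk patients who survive past time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Events and censored observations both leave the risk set.
        at_risk -= exits[t]
    return curve


# Toy data for illustration only; not patient data from TCGA.
toy_times = [5, 8, 8, 12, 15, 20]
toy_events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.4f}")
```

In an analysis like the one described, patients would typically be split into groups (for example, high versus low expression of a candidate gene), with one curve estimated per group and the curves compared with a log-rank test; that grouping is an assumption here and is not detailed in the abstract.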