Department of General Surgery, The First People's Hospital of Fuyang, Hangzhou, China.
Front Endocrinol (Lausanne). 2020 Aug 6;11:510. doi: 10.3389/fendo.2020.00510. eCollection 2020.
We aimed to screen for genes associated with thyroid cancer (THCA) prognosis and to construct a multi-gene risk prediction model for predicting and improving prognosis. HTSeq-Counts data for THCA, comprising 505 cancer samples and 57 normal tissue samples, were obtained from the TCGA database. The "edgeR" package was used for differential expression analysis, and weighted gene co-expression network analysis (WGCNA) was applied to screen for differentially expressed, co-expressed genes associated with THCA tissue type. Univariate Cox regression analysis was then used to select survival-related genes. These genes were analyzed with a LASSO regression model to develop an optimal prognostic model, which was evaluated using Kaplan-Meier and ROC curves. Differential analysis yielded 3,207 differentially expressed genes (DEGs), and WGCNA identified 23 co-expression genes (|COR| > 0.5, P < 0.05). Univariate Cox regression analysis identified eight genes significantly related to THCA survival, and LASSO regression analysis of these genes produced an optimal prognostic 3-gene risk prediction model. Based on this model, patients were divided into a high-risk group and a low-risk group. The Kaplan-Meier curve showed that patients in the low-risk group had markedly better survival than those in the high-risk group. Moreover, the ROC curve indicated high accuracy for the 3-gene model, and multivariate Cox regression analysis confirmed a significant correlation between the model and patient prognosis. The prognostic 3-gene model can be used as an independent prognostic factor and predicts the survival of THCA patients well.
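The risk-grouping step described above can be illustrated with a minimal sketch. It assumes the common convention that a patient's risk score is the sum of each gene's expression weighted by its LASSO Cox coefficient, and that patients are split at the median score; the abstract does not state the cutoff actually used. The gene names (`GENE_A`, `GENE_B`, `GENE_C`), coefficients, and expression values are hypothetical placeholders, not the study's results.

```python
from statistics import median

# Hypothetical gene names and coefficients, for illustration only.
# They are NOT the genes or weights reported by the study.
COEFFICIENTS = {"GENE_A": 0.8, "GENE_B": -0.5, "GENE_C": 0.3}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient * expression over the model genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(patients, coefficients=COEFFICIENTS):
    """Score every patient and split at the median score (an assumed cutoff)."""
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, cutoff, groups


# Made-up expression values for four hypothetical patients.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.0},
    "P2": {"GENE_A": 0.0, "GENE_B": 2.0, "GENE_C": 1.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 0.0, "GENE_C": 2.0},
    "P4": {"GENE_A": 0.5, "GENE_B": 0.5, "GENE_C": 0.5},
}

scores, cutoff, groups = stratify(patients)
for pid in sorted(patients):
    print(pid, round(scores[pid], 2), groups[pid])
```

In a real analysis the two groups would then be compared with Kaplan-Meier curves and a log-rank test, and the score entered into a multivariate Cox model alongside clinical covariates, as the abstract describes.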