Xun Dexu, Li Xue, Huang Lan, Zhao Yuanchun, Chen Jiajia, Qi Xin
School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China.
J Gastrointest Oncol. 2024 Oct 31;15(5):2100-2116. doi: 10.21037/jgo-24-325. Epub 2024 Oct 24.
Colorectal cancer (CRC) is a common intestinal malignancy worldwide, posing a serious threat to public health. Due to its high heterogeneity, prognosis and drug response of different CRC patients vary widely, limiting the effectiveness of traditional treatment. Therefore, this study aims to construct a novel CRC prognostic signature using machine learning algorithms to assist in making informed clinical decisions and improving treatment outcomes.
Gene expression matrices and clinical information of CRC patients were obtained from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Genes with prognostic value were then identified through univariate Cox regression analysis. Next, nine machine learning algorithms, namely least absolute shrinkage and selection operator (LASSO), gradient boosting machine (GBM), CoxBoost, plsRcox, Ridge, elastic net (Enet), StepCox, SuperPC and survivalSVM, were integrated to form 97 combinations, which were compared to select the best strategy for building a prognostic model based on the average C-index in the three CRC cohorts. Kaplan-Meier survival analysis, receiver operating characteristic (ROC) curve analysis and multivariate regression analysis were conducted to assess the predictive performance of the constructed signature. Furthermore, the CIBERSORT and ESTIMATE algorithms were used to quantify the infiltration level of immune cells. In addition, a nomogram was developed to predict 1-, 2-, and 3-year overall survival (OS) probabilities for individual patients.
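The model-selection criterion above, the average C-index across cohorts, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `harrell_c_index` and `mean_c_index` are hypothetical, the data are toy values, and pairs with tied survival times are simply skipped, which is a simplification of how ties are usually handled.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index (hypothetical helper, toy implementation).

    times:       observed follow-up times
    events:      1 if death was observed, 0 if censored
    risk_scores: model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied times are skipped for simplicity
        # a is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk, earlier event: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def mean_c_index(cohorts):
    """Average the C-index over (times, events, risk_scores) cohorts."""
    values = [harrell_c_index(t, e, r) for t, e, r in cohorts]
    return sum(values) / len(values)


# Toy cohort: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.4, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risks)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so candidate algorithm combinations can be compared by calling `mean_c_index` on each candidate's predictions for the same cohorts.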
A prognostic signature consisting of 13 genes was developed using LASSO Cox regression and GBM methods. Across both the training and validation datasets, the performance evaluation consistently indicated that the signature could accurately predict the prognosis of CRC patients. Notably, compared with 30 published signatures, the 13-gene model exhibited markedly superior predictive power. Even within clinical subgroups, it could still stratify patients precisely by prognosis. Functional analysis revealed a robust association between the signature and both immune status and chemotherapy response in CRC patients. Furthermore, a nomogram was created based on the signature-derived risk score, which demonstrated strong predictive ability for OS in CRC patients.
The 13-gene prognostic signature is expected to be a valuable tool for risk stratification, survival prediction, and treatment evaluation of patients with CRC.