Institute of Health Informatics, University College London, London, UK.
BMC Res Notes. 2021 Oct 2;14(1):385. doi: 10.1186/s13104-021-05789-0.
The objective of this study was to use ensemble clustering and tree-based risk models to identify interactions between clinicogenomic features of colorectal cancer, using data from the 100,000 Genomes Project.
Among the 2211 patients with colorectal cancer (mean age at diagnosis: 67.7 years; 59.7% male), 16.3%, 36.3%, 39.0% and 8.4% had stage 1, 2, 3 and 4 cancers, respectively. Almost every patient had surgery (99.7%), 47.4% had chemotherapy, 7.6% had radiotherapy and 1.4% had immunotherapy. On average, tumour mutational burden (TMB) was 18 mutations/Mb, and 34.4%, 31.3% and 25.7% of patients had structural or copy number mutations in KRAS, BRAF and NRAS, respectively. In the fully adjusted Cox model, patients with advanced cancer [stage 3 hazard ratio (HR) = 3.2; p < 0.001; stage 4 HR = 10.2; p < 0.001] and those who had immunotherapy (HR = 1.8; p < 0.04) or radiotherapy (HR = 1.5; p < 0.02) had a higher risk of dying. The ensemble clustering approach generated four distinct clusters: patients in cluster 2 had the best survival outcomes (1-year: 98.7%; 2-year: 96.7%; 3-year: 93.0%), while patients in cluster 3 had the worst (1-year: 87.9%; 2-year: 70.0%; 3-year: 53.1%). Kaplan-Meier analysis and the log-rank test showed that the clusters separated into distinct prognostic groups (p < 0.0001). Survival tree (recursive partitioning) analyses were performed to further explore risk groups within each cluster. Among patients in cluster 2, for example, interactions between cancer stage, grade, radiotherapy, TMB and BRAF mutation status were identified. Patients with stage 4 cancer and TMB ≥ 1.6 mutations/Mb had a 4 times higher risk of dying relative to the baseline hazard in that cluster.
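The Kaplan-Meier and log-rank comparison of clusters described above can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the function names `kaplan_meier` and `logrank_two_groups` are hypothetical, the follow-up times are invented toy values rather than study data, and the log-rank test is simplified to two groups, whereas the study compared four clusters.

```python
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times: follow-up durations; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 df."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in pooled if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)    # at risk, group A
        d = sum(1 for u, e, _ in pooled if u == t and e == 1)      # deaths at t
        d_a = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group-A death count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Invented toy follow-up data in months (assumption, not study data).
good_times, good_events = [12, 24, 36, 36, 36, 36], [0, 0, 1, 0, 0, 0]
poor_times, poor_events = [6, 10, 14, 20, 30, 36], [1, 1, 1, 1, 0, 0]

print(kaplan_meier(good_times, good_events))
print(kaplan_meier(poor_times, poor_events))
print(logrank_two_groups(good_times, good_events, poor_times, poor_events))
```

In practice a survival analysis library would normally be used instead; the hand-written version only makes the at-risk bookkeeping behind the reported survival percentages and p-value explicit.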