Department of Radiation Oncology, Seoul National University Bundang Hospital, Seongnam, 13620, Korea.
Department of Radiation Oncology, Seoul National University, College of Medicine, Seoul, Korea.
Biomark Med. 2021 Nov;15(16):1529-1539. doi: 10.2217/bmm-2021-0280. Epub 2021 Oct 15.
We tested whether a machine-learning algorithm could identify biomarkers that predict overall survival in breast cancer patients from blood-based whole-exome sequencing data. Whole-exome sequencing data from 1181 female breast cancer patients in the UK Biobank were collected. Using a long short-term memory model, we identified 50 feature genes related to total mutation burden. We then developed an XGBoost survival model with the selected feature genes. The XGBoost survival model performed acceptably in predicting overall survival, with a concordance index of 0.75 and a scaled Brier score of 0.146. Among patients aged ≥56 years, the high-mutation group had inferior overall survival compared with the low-mutation group (log-rank test, p = 0.042). These results show that machine-learning algorithms can be used to predict overall survival in breast cancer patients from blood-based whole-exome sequencing data.
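For readers unfamiliar with the concordance index reported above, the following minimal Python sketch shows how such a value can be computed from right-censored survival data. It is an illustration only, not the authors' implementation: the function name `concordance_index`, the toy data, and the choice to skip pairs with tied survival times are all assumptions made here for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell-style concordance index for right-censored data (illustrative).

    times  : observed follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risks  : model risk score per patient (higher = worse predicted survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): skip tied survival times
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # model ranks the earlier death as higher risk
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study): four patients, one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.7, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.75 means the model ordered roughly three out of four comparable patient pairs correctly.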