Lu Mei, Wu Kuan-Han Hank, Trudeau Sheri, Jiang Margaret, Zhao Joe, Fan Elliott
Department of Public Health Sciences, Henry Ford Health System, 3E One Ford Place, Detroit, MI, 48202, USA.
Division of Data Analytics, Northern Medical Center, Middletown, NY, USA.
Sci Rep. 2020 Nov 25;10(1):20575. doi: 10.1038/s41598-020-77653-3.
Tumor mutational burden (TMB) is associated with clinical response to immunotherapy, but its application has been limited to a subset of cancer patients. We hypothesized that machine learning combined with careful modeling could identify the mutations that best classify patients most likely to derive clinical benefit. The training data comprised two public whole-exome sequencing (WES) datasets for metastatic melanoma; the validation data comprised one public non-small cell lung cancer (NSCLC) dataset. Least Absolute Shrinkage and Selection Operator (LASSO) machine learning was used to identify a set of mutations (the biomarker) with maximum predictive accuracy, measured by the area under the receiver operating characteristic curve (AUROC). Kaplan-Meier and log-rank methods were used to test prediction of overall survival. The initial model considered 2139 mutations. After pruning, 161 mutations (11%) were retained. An optimal threshold of 0.41 divided patients into high-weight (HW) or low-weight (LW) TMB groups. Classification accuracy for HW-TMB was 100% (AUROC = 1.0) on the melanoma learning/testing data, and HW-TMB was a prognostic marker for longer overall survival. In the NSCLC validation data, HW-TMB was associated with survival (p = 0.0057) and predicted 6-month clinical benefit (AUROC = 0.83). In conclusion, we developed and validated a 161-mutation genomic signature with "outstanding" 100% accuracy for classifying melanoma patients by likelihood of response to immunotherapy. This biomarker can be adapted for clinical practice to improve cancer treatment and care.
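The scoring, thresholding, and AUROC steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the gene names, weights, intercept, and patient data are hypothetical, the logistic form of the score and the use of `>=` at the 0.41 cutoff are assumptions, and the LASSO fit that would produce the weights is not shown.

```python
import math

# Hypothetical sparse weights, standing in for the nonzero coefficients
# that a LASSO-penalized model would retain. Not values from the paper.
WEIGHTS = {"GENE_A": 1.2, "GENE_B": 0.8, "GENE_C": -0.5}
INTERCEPT = -1.0  # hypothetical


def weighted_tmb_score(mutated_genes, weights=WEIGHTS, intercept=INTERCEPT):
    """Logistic score from per-mutation weights (assumed functional form)."""
    linear = intercept + sum(weights.get(g, 0.0) for g in mutated_genes)
    return 1.0 / (1.0 + math.exp(-linear))


def classify(score, threshold=0.41):
    """Assign high-weight (HW) or low-weight (LW) group at the given cutoff."""
    return "HW" if score >= threshold else "LW"


def auroc(scores, labels):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy cohort: each entry lists the signature genes mutated in one patient.
patients = [
    ["GENE_A", "GENE_B"],
    ["GENE_C"],
    ["GENE_A"],
    [],
    ["GENE_B", "GENE_C"],
]
labels = [1, 0, 1, 1, 0]  # 1 = clinical benefit (made-up outcomes)

scores = [weighted_tmb_score(p) for p in patients]
groups = [classify(s) for s in scores]
toy_auroc = auroc(scores, labels)
```

With real data, the weights would come from a fitted L1-penalized model, and the cutoff would be chosen on training data only, so that AUROC on a held-out cohort remains an honest estimate.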