Ikushima H, Watanabe K, Shinozaki-Ushiku A, Oda K, Kage H
Department of Respiratory Medicine, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
Department of Respiratory Medicine, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Next-Generation Precision Medicine Development Laboratory, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
ESMO Open. 2024 Dec;9(12):103998. doi: 10.1016/j.esmoop.2024.103998. Epub 2024 Nov 25.
The low probability of identifying druggable mutations through comprehensive genomic profiling (CGP), together with its financial and time costs, hinders its widespread adoption. To enhance the effectiveness and efficiency of cancer precision medicine, it is critical to identify the characteristics of patients who are most likely to benefit from CGP.
This nationwide retrospective study employed machine learning models to predict whether CGP would identify a genome-matched therapy, using a national database covering 99.7% of the patients who underwent CGP in Japan from June 2019 to November 2023. Prediction models were constructed for the overall cancer population, specific cancer types, and the adolescent and young adult (AYA) group. The SHapley Additive exPlanations (SHAP) algorithm was applied to elucidate the clinical features contributing to model predictions.
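To make the attribution idea behind SHAP concrete, the following is a minimal sketch that computes exact Shapley values for a tiny hand-written scoring function. It is not the authors' pipeline: the function `toy_score`, its coefficients, the feature names, and the reference patient are all hypothetical and chosen only for illustration. The SHAP library approximates this quantity efficiently for tree models, whereas this sketch enumerates every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) per feature.

    Features absent from a coalition take their value from `baseline`.
    """
    names = list(x)
    n = len(names)
    values = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                # Classic Shapley weight for a coalition of size k.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_feature = dict(without)
                with_feature[name] = x[name]
                total += weight * (predict(with_feature) - predict(without))
        values[name] = total
    return values


def toy_score(p):
    """Hypothetical score; the coefficients are invented, not from the study."""
    is_lung = p["cancer_type"] == "lung"
    score = 0.10
    score += 0.15 if is_lung else 0.0
    score -= 0.05 if p["liver_metastasis"] else 0.0
    # Interaction term: metastatic sites only matter for lung in this toy model.
    score += 0.02 * p["n_metastatic_sites"] * (1 if is_lung else 0)
    return score


patient = {"cancer_type": "lung", "liver_metastasis": True, "n_metastatic_sites": 2}
reference = {"cancer_type": "other", "liver_metastasis": False, "n_metastatic_sites": 0}

attributions = shapley_values(toy_score, patient, reference)
for feature, value in attributions.items():
    print(f"{feature}: {value:+.3f}")
```

The attributions sum to the difference between the patient's score and the reference score, which is the property that lets SHAP rank features by their contribution to an individual prediction.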
This study included 60 655 patients [mean age (standard deviation), 60.8 years (14.5 years); 50.1% males]. CGP identified at least one genome-matched therapy in 11 227 cases (18.5%). The best prediction model was eXtreme Gradient Boosting (XGBoost), with an area under the receiver operating characteristic curve of 0.819. Cancer type was the most important predictor (negative for pancreas and positive for breast and lung), followed by age, presence of liver metastasis, and number of metastatic sites. Analysis of cancer type-specific models identified several organ-specific features, including sex, interval between cancer diagnosis and CGP, sampling site, and CGP panel. Among 3455 AYA patients, genome-matched therapies were identified in 459 patients (13.3%). The AYA-specific model achieved an area under the receiver operating characteristic curve of 0.768, with bone tumor identified as a negative predictor in addition to those identified in the overall cancer population model.
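As a reading aid for the metrics above, here is a minimal sketch of how an area under the receiver operating characteristic curve can be computed from labels and predicted scores, using the pairwise (Mann-Whitney) formulation. The function name `auroc` and the five-patient example are illustrative assumptions, not data from the study. The last lines simply recompute the reported percentages from the reported counts.

```python
def auroc(labels, scores):
    """Probability that a random positive case outscores a random negative case.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = genome-matched therapy identified, 0 = not identified.
example_labels = [1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.2]
example_auc = auroc(example_labels, example_scores)
print(f"toy AUROC: {example_auc:.3f}")

# Percentages recomputed from the counts reported in the abstract.
overall_pct = round(100 * 11227 / 60655, 1)
aya_pct = round(100 * 459 / 3455, 1)
print(overall_pct, aya_pct)
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so the reported values of 0.819 and 0.768 indicate that the models rank patients substantially better than chance.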
Several factors predicting the identification of genome-matched therapies through CGP were identified for the overall cancer population and cancer type-specific subpopulations. Expedited CGP is recommended for patients who match the identified profile to facilitate early targeted therapy.