Wu Wen-Feng, Lai Kuan-Ming, Chen Chia-Hung, Wang Bai-Chuan, Chen Yi-Jen, Shen Chia-Wei, Chen Kai-Yan, Lin Eugene C, Chen Chien-Chin
Department of Radiology, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi, 600, Taiwan.
Central Taiwan University of Science and Technology Institute of Radiological Science, Taichung, 406, Taiwan.
Discov Oncol. 2024 Sep 14;15(1):447. doi: 10.1007/s12672-024-01333-1.
Early detection of the T790M mutation in exon 20 of the epidermal growth factor receptor (EGFR) gene in non-small cell lung cancer (NSCLC) patients with brain metastasis is crucial for optimizing treatment strategies. In this study, we developed radiomics models that use multisequence MR images of brain metastases to distinguish T790M-positive from T790M-negative NSCLC patients, working from an imbalanced dataset. Various resampling techniques and classifiers were evaluated to identify the most effective combination.
Radiomic analyses were conducted on a dataset of 125 patients: 18 with EGFR T790M-positive mutations and 107 with T790M-negative mutations. Seventeen first- and second-order statistical features were selected from CET1WI, T2WI, T2FLAIR, and DWI images. Four classifiers (logistic regression, support vector machine [SVM], random forest [RF], and extreme gradient boosting [XGBoost]) were evaluated under 13 different resampling conditions.
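To illustrate what oversampling does to an 18-versus-107 class split, the following is a minimal sketch of plain SMOTE, which creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. It is not the study's pipeline: the function name `smote`, the two-dimensional Gaussian toy features, and the parameter values are assumptions made for illustration, and the SVM-SMOTE variant named in this abstract differs in how it chooses the samples to interpolate from.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points built by plain SMOTE interpolation.

    Hypothetical helper for illustration only, not the authors' code.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment from base to other
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (assumed): only the class sizes, 18 vs 107, come from the abstract.
rng = random.Random(42)
positives = [[rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0)] for _ in range(18)]
negatives = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(107)]

# Generate enough synthetic positives to match the majority class size.
new_points = smote(positives, n_new=len(negatives) - len(positives))
balanced_positives = positives + new_points
print(len(balanced_positives), len(negatives))
```

In practice, resampling of this kind is normally applied to the training portion only, so that synthetic samples do not leak into the data used for evaluation.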
The best-performing combination, SVM-SMOTE oversampling with the XGBoost classifier, achieved an area under the curve (AUC) of 0.89. The AUC reported in the literature served as an upper-bound reference for this performance. Comparable results were also observed when other oversampling methods were paired with the RF or XGBoost classifiers.
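As a reminder of what the AUC measures, the sketch below computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions for illustration and do not come from the study.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): 2 positives, 4 negatives.
labels = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]
print(roc_auc(labels, scores))  # 7 of 8 pairs ranked correctly -> 0.875

# With 18 positives and 107 negatives, always predicting "negative" is
# 107/125 = 85.6% accurate, yet a constant score carries no ranking information.
majority_accuracy = 107 / 125
constant_auc = roc_auc([1] * 18 + [0] * 107, [0.0] * 125)
print(majority_accuracy, constant_auc)
```

This is why a ranking metric such as the AUC is more informative than raw accuracy on a dataset with this degree of imbalance.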
Our study demonstrates that, even when dealing with an imbalanced EGFR T790M dataset, reasonable predictive outcomes can be achieved by employing an appropriate combination of resampling techniques and classifiers. This approach has significant potential for enhancing T790M mutation detection in NSCLC patients with brain metastasis.