Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei 106, Taiwan.
Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei 106, Taiwan.
Int J Mol Sci. 2021 Aug 26;22(17):9254. doi: 10.3390/ijms22179254.
Early identification of epidermal growth factor receptor (EGFR) and Kirsten rat sarcoma viral oncogene homolog (KRAS) mutations is crucial for selecting a therapeutic strategy for patients with non-small-cell lung cancer (NSCLC). We propose a machine learning-based model for feature selection and prediction of EGFR and KRAS mutations in patients with NSCLC that uses the smallest number of the most semantic radiomics features. From 211 patients with NSCLC in The Cancer Imaging Archive (TCIA), we included a cohort of 161 patients and analyzed their 161 low-dose computed tomography (LDCT) images to detect EGFR and KRAS mutations. A total of 851 radiomics features, classified into 9 categories, were obtained through manual segmentation and radiomics feature extraction from the LDCT images. We evaluated our models on a validation set of 18 patients derived from the same TCIA dataset. The genetic algorithm plus XGBoost classifier performed best, with accuracies of 0.836 and 0.86 for detecting EGFR and KRAS mutations, respectively. These results show that a noninvasive machine learning-based model using the smallest number of the most semantic radiomics signatures can robustly predict EGFR and KRAS mutations in patients with NSCLC.
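The abstract names a genetic algorithm for feature selection paired with an XGBoost classifier but gives no implementation details. The following is only a minimal sketch of the general idea of genetic-algorithm feature selection, under several assumptions that are not from the paper: the data are synthetic (not the TCIA cohort), a simple nearest-centroid classifier stands in for XGBoost so that the example needs only the Python standard library, and every name and parameter (`make_toy_data`, `centroid_accuracy`, `select_features`, `penalty`, population size, mutation rate) is hypothetical. The fitness subtracts a small penalty per selected feature, which is one possible way to favor a small feature set.

```python
import random


def make_toy_data(n_samples=120, n_features=20, informative=(0, 1, 2), seed=0):
    """Synthetic stand-in for a radiomics table (NOT the TCIA data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # only these columns carry class signal
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_train, y_train, X_val, y_val, mask):
    """Validation accuracy of a nearest-centroid classifier on masked features.

    Assumption: this simple classifier replaces XGBoost for illustration only.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_val, y_val):
        dists = {
            label: sum((x[j] - c) ** 2 for j, c in zip(cols, cen))
            for label, cen in centroids.items()
        }
        correct += min(dists, key=dists.get) == t
    return correct / len(y_val)


def select_features(X_train, y_train, X_val, y_val, pop_size=30,
                    generations=25, mutation_rate=0.05, penalty=0.005, seed=1):
    """Evolve boolean feature masks; return (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n = len(X_train[0])

    def fitness(mask):
        # Accuracy minus a per-feature penalty (hypothetical weighting).
        acc = centroid_accuracy(X_train, y_train, X_val, y_val, mask)
        return acc - penalty * sum(mask)

    population = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = ranked[:2]               # elitism: keep the two best masks
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, score = select_features(X[:80], y[:80], X[80:], y[80:])
    print("selected feature indices:", [j for j, keep in enumerate(mask) if keep])
    print("penalized fitness:", round(score, 3))
```

In a real pipeline the fitness function would presumably wrap the actual classifier with cross-validation on the training data, and the final accuracy would be reported on a held-out set that played no part in the selection.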