Lin Jianing, Yan Zhihang, He Longyu, Zhang Hao, Xie Mingxuan
Department of Geriatric Pulmonary and Critical Care Medicine, Xiangya Hospital, Central South University; National Clinical Research Center for Geriatric Disorders (Xiangya Hospital), Changsha 410008.
School of Electronic Information, Central South University, Changsha 410075, China.
Zhong Nan Da Xue Xue Bao Yi Xue Ban. 2025 May 28;50(5):805-814. doi: 10.11817/j.issn.1672-7347.2025.250026.
Non-small cell lung cancer (NSCLC) is associated with poor prognosis, with 30% of patients diagnosed at an advanced stage. Mutations in the and genes are important prognostic factors for NSCLC, and targeted therapies can significantly improve survival in these patients. Although tissue biopsy remains the gold standard for detecting gene mutations, it has limitations, including invasiveness, sampling errors due to tumor heterogeneity, and poor reproducibility. This study aims to develop machine learning models based on radiomic features to predict and gene mutation status in NSCLC patients, thereby providing a reference for precision oncology.
Imaging and mutation data from eligible NSCLC patients were obtained from the publicly available Lung-PET-CT-Dx dataset in The Cancer Imaging Archive (TCIA). A three-dimensional convolutional neural network (3D-CNN) was used to extract imaging features from the regions of interest (ROI). The LightGBM algorithm was employed to build classification models for predicting and gene mutation status. Model performance was evaluated using 5-fold cross-validation, with receiver operating characteristic (ROC) curves, area under the curve (AUC), accuracy, sensitivity, and specificity as the evaluation metrics.
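The evaluation protocol described above (5-fold cross-validation scored by AUC) can be illustrated with a minimal, dependency-free sketch. This is not the authors' pipeline: the 3D-CNN feature extractor and the LightGBM classifier are omitted, the function names `stratified_k_folds` and `roc_auc` are hypothetical, and stratifying the folds by class is an assumption, since the abstract states only that 5-fold cross-validation was used.

```python
import random


def stratified_k_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with roughly equal class balance.

    Assumption: stratification is a common choice for small, imbalanced
    cohorts; the abstract does not say whether the folds were stratified.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin into the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def roc_auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Synthetic labels for illustration only: 10 mutant (1), 10 wild-type (0).
labels = [1] * 10 + [0] * 10
folds = stratified_k_folds(labels, k=5)

# Toy scores for illustration only, not model output.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In a full pipeline, each fold would serve once as the held-out set while a classifier is fitted on the other four, and `roc_auc` would be applied to the held-out predicted probabilities.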
The models effectively predicted and mutations in NSCLC patients, achieving an AUC of 0.95 for mutations and 0.90 for . The models also demonstrated high accuracy ( 89.66%; 87.10%), sensitivity ( 93.33%; 87.50%), and specificity ( 85.71%; 86.67%).
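Accuracy, sensitivity, and specificity all derive from a single confusion matrix, which the short sketch below makes explicit. The counts are hypothetical: the abstract reports no confusion matrices, and these values were chosen only because they reproduce the reported percentages arithmetically. They should not be read as the study's actual sample sizes, and the function name `confusion_metrics` is illustrative.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity as percentages."""
    total = tp + fn + tn + fp
    return {
        "accuracy": round(100 * (tp + tn) / total, 2),
        "sensitivity": round(100 * tp / (tp + fn), 2),  # true-positive rate
        "specificity": round(100 * tn / (tn + fp), 2),  # true-negative rate
    }


# Hypothetical counts (assumption, not reported data) that match the first
# set of percentages in the text: 89.66%, 93.33%, 85.71%.
first_model = confusion_metrics(tp=14, fn=1, tn=12, fp=2)

# Hypothetical counts (assumption, not reported data) that match the second
# set of percentages in the text: 87.10%, 87.50%, 86.67%.
second_model = confusion_metrics(tp=14, fn=2, tn=13, fp=2)
```

Because accuracy is a prevalence-weighted blend of the other two metrics, it always falls between sensitivity and specificity, as it does in both reported sets.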
A radiogenomics-machine learning predictive model can serve as a non-invasive tool for anticipating and gene mutation status in NSCLC patients.