Chen Rui, Zhang Hu, Huang Xingwen, Han Haitao, Jian Jinbo
Department of Oncology, Binzhou Medical University Hospital, Binzhou, Shandong 256603, PR China.
Department of Radiology, Binzhou Medical University Hospital, Binzhou, Shandong 256603, PR China.
Eur J Radiol Open. 2025 Aug 23;15:100680. doi: 10.1016/j.ejro.2025.100680. eCollection 2025 Dec.
To develop and validate a CT radiomics-based machine learning model for differentiating the pathological subtypes of pulmonary ground-glass nodules (GGNs).
A retrospective analysis was conducted on clinical data and radiological images from 392 patients with lung adenocarcinoma treated at Binzhou Medical University Hospital between January 1, 2020 and May 31, 2023. All patients underwent preoperative thin-section chest CT and surgical resection. A total of 400 GGNs were included. Regions of interest (ROIs) were delineated on the slice showing the largest diameter of each lesion. Based on pathological confirmation, the nodules were divided into two groups: Group 1 (adenocarcinoma in situ [AIS] or minimally invasive adenocarcinoma [MIA]; 209 nodules) and Group 2 (invasive adenocarcinoma [IAC]; 191 nodules). The dataset was randomly split 7:3 into a training set (280 nodules, 70%) and a validation set (120 nodules, 30%). In the training set, feature dimensionality was reduced using minimum redundancy maximum relevance (mRMR) followed by the least absolute shrinkage and selection operator (LASSO) to select discriminative radiomics features. Seven machine learning models were then constructed: logistic regression (LR), support vector machine (SVM), random forest (RF), extra trees, XGBoost, GradientBoosting, and AdaBoost. Model performance was evaluated using receiver operating characteristic (ROC) curves, with area under the curve (AUC), accuracy, sensitivity, and specificity as the metrics.
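The workflow described above can be illustrated with a minimal sketch. It is not the authors' code: the feature table is synthetic, the scikit-learn defaults and random seeds are assumptions, and the mRMR pre-filter is omitted because scikit-learn does not provide one, so only the LASSO selection step is shown before fitting a single GradientBoosting classifier.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LassoCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a radiomics table (assumption): 400 nodules x 100 features.
X, y = make_classification(n_samples=400, n_features=100,
                           n_informative=8, random_state=0)

# Random 7:3 split into training (280) and validation (120) sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Standardize using training-set statistics only, to avoid leakage.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)

# LASSO with cross-validated penalty; features with nonzero coefficients are kept.
lasso = LassoCV(cv=5, random_state=0).fit(X_train_s, y_train)
selected = np.flatnonzero(lasso.coef_)
if selected.size == 0:
    # Fallback (assumption): keep all features if LASSO zeroes everything out.
    selected = np.arange(X.shape[1])

# Fit one of the seven classifier families on the selected features.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train_s[:, selected], y_train)

# AUC on the held-out validation set.
val_scores = model.predict_proba(X_val_s[:, selected])[:, 1]
val_auc = roc_auc_score(y_val, val_scores)
print(f"selected features: {selected.size}, validation AUC: {val_auc:.3f}")
```

Fitting the scaler and LASSO on the training set alone keeps the validation set untouched until the final scoring step. The other six model families could be swapped in at the `model =` line.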
Eight radiomics features were ultimately selected. Among the seven models, the GradientBoosting model performed best, achieving an AUC of 0.929 (95% CI: 0.9004-0.9584), accuracy of 0.85, sensitivity of 0.851, and specificity of 0.849 in the training set.
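For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from a binary confusion matrix. The labels are a made-up toy example, not study data, and `binary_metrics` is a hypothetical helper name; label 1 is assumed to denote the positive class (for instance, IAC).

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for 0/1 labels."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Toy labels (assumption): 4 positives and 6 negatives.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 1])

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Here 3 of 4 positives and 4 of 6 negatives are classified correctly, so sensitivity is 0.75, specificity is about 0.667, and accuracy is 0.7.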
The GradientBoosting model based on CT radiomics features performed best among the seven models in predicting the pathological subtype of lung adenocarcinoma presenting as ground-glass nodules, and may serve as a reliable auxiliary tool for clinical diagnosis.