Zhu Jun, Tao Jiayu, Zhang Fengfeng, Yao Jie, Chen Ke, Wang Yuxuan, Lu Xiaochen, Ni Bin, Zhu Maoshan
Department of Thoracic Surgery, the First Affiliated Hospital of Soochow University, Suzhou, China.
Department of Oncology, the First Affiliated Hospital of Soochow University, Suzhou, China.
J Thorac Dis. 2025 Apr 30;17(4):2423-2440. doi: 10.21037/jtd-2025-310. Epub 2025 Apr 28.
BACKGROUND: Lung adenocarcinoma (LUAD) is the most frequently diagnosed subtype of non-small cell lung cancer (NSCLC), and prognosis can vary significantly among LUAD patients with different tumor subtypes. Radiomics and machine learning (ML) make it possible to develop non-invasive models that predict pathology. We sought to develop computed tomography (CT) radiomics-based diagnostic models, enhanced by ML, to predict LUAD malignancy grade and guide surgical strategy.
METHODS: In this retrospective analysis, 168 surgical patients with histology-confirmed LUAD were divided into a low-risk group (n=93) and an intermediate-to-high-risk group (n=75) based on postoperative pathology. The region of interest (ROI) was delineated on the preoperative CT images of all patients, and radiomic features were then extracted. Patients were randomly allocated to a training set (n=117) and a testing set (n=51) in a 7:3 ratio. Within the training set, a clinical-radiological model (CM) was developed from patients' clinical characteristics and radiological semantic features, and a radiomics model (RM) was developed from radiomic features, from which Rad scores were calculated. The Rad scores were then combined with the independent risk factors identified among the clinical-radiological features, and logistic regression (LR), decision tree (DT), random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM), K-nearest neighbors (KNN), and naïve Bayes model (NBM) were used to build different comprehensive models (COMs). The optimal model was identified based on the receiver operating characteristic (ROC) curves and the DeLong test. Finally, Shapley additive explanations (SHAP) were used to visualize the predictive processes of the models.
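The 7:3 allocation and the Rad score described in the methods can be illustrated with a minimal sketch. This is not the authors' code: the function names, the seed, and the feature names are hypothetical. The abstract does not state how the Rad score was computed, so the linear form below (intercept plus a weighted sum of selected features) is an assumption based on common radiomics practice.

```python
import random


def split_indices(n_patients, train_fraction=0.7, seed=0):
    """Randomly split patient indices into training and testing sets.

    The training size is truncated (not rounded), which for 168 patients
    and a 0.7 fraction gives 117 training and 51 testing patients.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    indices = list(range(n_patients))
    rng.shuffle(indices)
    n_train = int(n_patients * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def rad_score(features, coefficients, intercept=0.0):
    """Assumed linear Rad score: intercept + sum(coefficient * feature value).

    `features` and `coefficients` are dicts keyed by (hypothetical) feature name;
    only features that carry a coefficient contribute to the score.
    """
    return intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )


train_idx, test_idx = split_indices(168)
print(len(train_idx), len(test_idx))
```

In a real pipeline the coefficients would come from a feature-selection step fitted on the training set only, so that the testing set stays unseen until evaluation.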
RESULTS: Among the 168 patients enrolled, there were 50 males (29.76%) aged 56 (49.25, 67.00) years and 118 females (70.24%) aged 56.5 (42.00, 64.00) years. Diameter (P<0.001) and consolidation-to-tumor ratio (CTR) ≥0.5 (P=0.002) were identified as independent risk factors for the malignancy grade of LUAD during CM creation. The CM had an area under the ROC curve (AUC) of 0.909 [95% confidence interval (CI): 0.856-0.962] in the training set and 0.920 (95% CI: 0.846-0.994) in the testing set. The RM, comprising seven radiomic features, achieved an AUC of 0.961 (95% CI: 0.926-0.996) in the training set and 0.957 (95% CI: 0.905-1.000) in the testing set. Among the models created using various ML algorithms, the XGBoost model was identified as the optimal model. SHAP visualization revealed the model prediction processes and the contributions of different features.
CONCLUSIONS: We constructed and validated a robust, integrative model leveraging ML and CT radiomics that combines radiomic, clinical, and radiological attributes to identify LUADs with elevated postoperative pathological grades. This enables clinicians to tailor surgical plans before the operation according to the predicted pathology of each patient's tumor.
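The AUC values reported in the results can be read as the probability that a randomly chosen intermediate-to-high-risk patient receives a higher model score than a randomly chosen low-risk patient. The sketch below computes that quantity directly by pairwise comparison. It is an illustration with made-up labels and scores, not the study data or the authors' implementation, and it does not cover the confidence intervals or the DeLong test.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for the positive class (here, higher-grade tumors), 0 otherwise.
    scores: model outputs where larger means more likely positive.
    Tied positive/negative pairs count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical values): 3 of the 4 positive/negative pairs
# are ranked correctly, so the AUC is 0.75.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example_auc)
```

The pairwise form is quadratic in the number of patients, which is fine for a cohort of this size; rank-based formulas give the same value more efficiently on large datasets.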