Department of Nuclear Medicine, West China Hospital, Sichuan University, 37# GuoXueLane, Chengdu, 610041, China.
Department of Biotherapy, Cancer Center, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, 37# GuoXueLane, Chengdu, 610041, China.
Eur J Nucl Med Mol Imaging. 2021 Aug;48(9):2904-2913. doi: 10.1007/s00259-021-05220-7. Epub 2021 Feb 5.
This study was designed and performed to assess the ability of 18F-fluorodeoxyglucose (18F-FDG) positron emission tomography (PET) and computed tomography (CT) radiomics features combined with machine learning methods to differentiate between primary and metastatic lung lesions and to classify histological subtypes. Moreover, we identified the optimal machine learning method.
A total of 769 patients with pathologically diagnosed primary or metastatic lung cancer were enrolled. We used the LIFEx package to extract radiological features from semiautomatically segmented PET and CT images within the same volume of interest. Patients were randomly assigned to training and validation sets. Discriminant models were established by evaluating five feature selection methods and nine classification methods. The robustness of the procedure was assessed by tenfold cross-validation. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC).
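The selection-plus-classification procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the software used for modeling, so the use of scikit-learn, the synthetic data standing in for LIFEx features, and all hyperparameters are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# Assumption: synthetic features stand in for the radiomics features
# extracted with LIFEx; the sample and feature counts are arbitrary.
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=8, random_state=0
)

# GBDT-based feature selection followed by a GBDT classifier.
# Placing both steps in one pipeline means the selection is refit
# inside every fold, so held-out samples never influence it.
pipeline = Pipeline([
    ("select", SelectFromModel(
        GradientBoostingClassifier(n_estimators=50, random_state=0))),
    ("classify", GradientBoostingClassifier(n_estimators=50, random_state=0)),
])

# Tenfold stratified cross-validation, scored by AUC.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="roc_auc")

print(f"AUC per fold: {np.round(scores, 3)}")
print(f"Mean AUC: {scores.mean():.3f}")
```

Swapping the estimator passed to `SelectFromModel` or the final classifier gives the other selector and classifier pairings.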
Based on the radiomics features extracted from PET and CT images, 45 discriminative models were established. Combined with appropriate feature selection methods, most classifiers showed excellent discriminative ability, with AUCs greater than 0.75. In the differentiation between primary and metastatic lung lesions, gradient boosting decision tree (GBDT) feature selection combined with the GBDT classifier achieved the highest AUC, 0.983, in the PET dataset. In the CT dataset, eXtreme gradient boosting (XGBoost) feature selection combined with the random forest (RF) classifier achieved the highest AUC, 0.828. In the discrimination between squamous cell carcinoma and adenocarcinoma, GBDT feature selection combined with the GBDT classifier had the highest AUC, 0.897, in the PET dataset. In the CT dataset, GBDT feature selection combined with the RF classifier had the highest AUC, 0.839. Most of the decision tree (DT)-based models were overfitted, suggesting that this classification method is not appropriate for practical application.
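The 45 models correspond to every pairing of the five feature selection methods with the nine classifiers (5 × 9 = 45). The sketch below shows how such a grid can be enumerated and how overfitting can be flagged by comparing training AUC with held-out AUC. It is a hedged illustration only: scikit-learn, the reduced 2 × 3 grid, the synthetic data, and the 70/30 split are all assumptions, not details from the study.

```python
from itertools import product

from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data in place of the PET or CT radiomics features.
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=8, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# A reduced grid (2 selectors x 3 classifiers) for illustration only.
selectors = {
    "GBDT": SelectFromModel(
        GradientBoostingClassifier(n_estimators=50, random_state=0)),
    "RF": SelectFromModel(
        RandomForestClassifier(n_estimators=50, random_state=0)),
}
classifiers = {
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "GBDT": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

results = {}
for (sel_name, sel), (clf_name, clf) in product(
    selectors.items(), classifiers.items()
):
    # clone() gives each pairing its own unfitted estimators.
    model = Pipeline([("select", clone(sel)), ("classify", clone(clf))])
    model.fit(X_train, y_train)
    train_auc = roc_auc_score(y_train, model.predict_proba(X_train)[:, 1])
    val_auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
    results[(sel_name, clf_name)] = (train_auc, val_auc)

# A large gap between training and held-out AUC signals overfitting.
for (sel_name, clf_name), (train_auc, val_auc) in results.items():
    print(f"{sel_name:>4} + {clf_name:<4} train={train_auc:.3f} "
          f"val={val_auc:.3f} gap={train_auc - val_auc:.3f}")
```

An unpruned decision tree typically fits the training data perfectly, so its gap tends to be the largest in a comparison like this one.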
18F-FDG PET/CT radiomics features combined with machine learning methods can distinguish between primary and metastatic lung lesions and identify histological subtypes in lung cancer. GBDT and RF were considered optimal classification methods for the PET and CT datasets, respectively, and GBDT was considered the optimal feature selection method in our analysis.