Shen Yao, Xu Fangyi, Zhu Wenchao, Hu Hongjie, Chen Ting, Li Qiang
Department of Radiology, Sir Run Run Shaw Hospital Affiliated with the School of Medicine of Zhejiang University, Hangzhou 310016, China.
Department of Radiology, Yinzhou Hospital Affiliated with the School of Medicine of Ningbo University, Ningbo 315040, China.
Ann Transl Med. 2020 Mar;8(5):171. doi: 10.21037/atm.2020.01.135.
To test the ability of a multiclassifier model based on radiomics features to predict whether primary pulmonary solid nodules are benign or malignant.
Computed tomography (CT) images of 342 patients with primary pulmonary solid nodules confirmed by histopathology or follow-up were retrospectively analyzed. The region of interest (ROI) was delineated on the images, and the radiomics features of the lesions were extracted. Feature weights were calculated using the Relief feature selection algorithm. Based on the selected features, five classifier models were constructed: support vector machine (SVM), random forest (RF), logistic regression (LR), extreme learning machine (ELM), and K-nearest neighbor (KNN). Precision, recall, and area under the receiver operating characteristic curve (AUC) were used to evaluate the prediction performance of each classifier. The prediction result of each classifier was first weighted, and the weighted results were then fused to predict the nodule type of unseen images. The precision, recall, and AUC of the fusion classifier were compared with those of each single classifier. Cross-validation was used to evaluate the generalization of the fusion classifier, and t- and F-tests were performed on the five classifiers and the fusion classifier.
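The two core steps described above, Relief feature weighting and weighted fusion of classifier outputs, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which Relief variant was used or how the classifier weights were derived, so the basic binary Relief with range-scaled Manhattan distance, the function names `relief_weights` and `fuse`, and the example weights are all assumptions made for illustration.

```python
def relief_weights(X, y):
    """Basic Relief for binary labels (illustrative variant, assumed here).

    A feature gains weight when it differs between a sample and its nearest
    neighbor of the other class (nearest miss), and loses weight when it
    differs between a sample and its nearest same-class neighbor (nearest hit).
    """
    n, d = len(X), len(X[0])

    # Per-feature value ranges, so every feature difference lies in [0, 1].
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        # Manhattan distance over range-scaled features.
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # cannot score a sample without both neighbor types
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights


def fuse(probabilities, classifier_weights):
    """Weighted average of per-classifier malignancy probabilities.

    The weights are hypothetical; one plausible choice would be each
    classifier's validation AUC, but the abstract does not specify this.
    """
    total = sum(classifier_weights)
    return sum(p * w for p, w in zip(probabilities, classifier_weights)) / total


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is noise.
    X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
    y = [0, 0, 0, 1, 1, 1]
    print(relief_weights(X, y))

    # Five hypothetical classifier outputs for one nodule, with made-up weights.
    print(fuse([0.8, 0.9, 0.6, 0.85, 0.7], [0.74, 0.86, 0.68, 0.83, 0.70]))
```

In this sketch, features would be ranked by their Relief weight and the top-ranked ones kept, and a nodule would be called malignant when the fused probability exceeds a chosen threshold such as 0.5. Both the ranking cutoff and the threshold are illustrative choices, not values reported in the study.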
For each ROI, 450 features in four major categories were extracted and analyzed using the Relief feature selection algorithm. According to the weights, 25 highly repeatable, nonredundant, and stable features that played a major role in pulmonary nodule classification were selected. The fusion classifier's prediction performance (precision =92.0%, AUC =0.915) was superior to that of SVM (precision =75.3%, AUC =0.740), RF (precision =89.1%, AUC =0.855), LR (precision =68.4%, AUC =0.681), ELM (precision =87.0%, AUC =0.830), and KNN (precision =77.1%, AUC =0.702). The fusion classifier also performed best in the t-test (P=0.035) and F-test (P=0.036).
The multiclassifier fusion model based on radiomics features had high predictive value for distinguishing benign from malignant primary pulmonary solid nodules.