Department of Radiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, No. 639 Zhizaoju Road, Shanghai, 200010, China.
Department of Radiology, Eye & ENT Hospital of Shanghai Medical School, Fudan University, No. 83 Fenyang Road, Shanghai, 200030, China.
Eur Radiol. 2020 Dec;30(12):6858-6866. doi: 10.1007/s00330-020-07011-4. Epub 2020 Jun 26.
To compare the reproducibility of CT texture features derived from 2D and 3D segmentations, and the performance of machine learning (ML) classifiers built on those features, for predicting human papillomavirus (HPV) status in oropharyngeal squamous cell carcinoma (OPSCC).
Data from 47 patients with pathologically confirmed OPSCC (15 HPV positive and 32 HPV negative) were collected from a public database. Using 2D and 3D manual segmentations, 1032 texture features were extracted from contrast-enhanced CT images. Intraclass correlation coefficients (ICCs) were calculated to evaluate intraobserver and interobserver reproducibility. Collinearity analysis and a wrapper-based subset search algorithm were used for feature selection. Models were created using k-nearest neighbors (k-NN), logistic regression (LR), and random forest (RF), each alone and in combination with the synthetic minority oversampling technique (SMOTE). Classifier performance was assessed using 10-fold cross-validation.
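The reproducibility step can be illustrated with a short sketch. The abstract does not say which ICC form was used, so the code below assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), computed per feature from a subjects-by-observers matrix. The function name `icc_2_1` and the example ratings are hypothetical and are not data from the study.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `ratings` is an (n_subjects, k_observers) array holding one texture
    feature measured on each subject by each observer (or reading session).
    The choice of ICC(2,1) is an assumption; the abstract names no ICC form.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()

    # Sums of squares for the two-way ANOVA decomposition.
    ss_subjects = k * np.sum((x.mean(axis=1) - grand_mean) ** 2)
    ss_observers = n * np.sum((x.mean(axis=0) - grand_mean) ** 2)
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_subjects - ss_observers

    # Mean squares.
    ms_subjects = ss_subjects / (n - 1)
    ms_observers = ss_observers / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_observers - ms_error) / n
    )


# Hypothetical example: one feature, six lesions, four readers.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(f"ICC(2,1) = {icc_2_1(example):.2f}")
```

In a workflow like the one described, a function of this kind would be applied to each of the 1032 features, once for the intraobserver pair and once for the interobserver pair. A feature would be kept only if both values cleared a chosen cutoff, which the abstract does not specify.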
Compared with 2D segmentation (468 of 1032, 45.3%), 3D segmentation (576 of 1032, 55.8%) yielded more texture features with reliable reproducibility (good to excellent in both intraobserver and interobserver analyses) (p < 0.001). However, RF and k-NN classifiers did not perform better with 3D features than with 2D features, either alone or with SMOTE. The best models for 2D and 3D segmentations were both created using RF. Alone, RF achieved areas under the curve (AUCs) of 0.880 and 0.847, respectively; with SMOTE, it achieved AUCs of 0.953 and 0.920, respectively.
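As a rough arithmetic check on the reported proportions, the sketch below recomputes the two percentages and applies an unpaired two-proportion z-test. This is an assumption for illustration only: the abstract does not name the test behind p < 0.001. Because the same 1032 features are assessed under both segmentations, a paired test such as McNemar's would be the more natural choice, but it needs per-feature discordant counts that the abstract does not report. The function name `two_proportion_z` is hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Unpaired two-proportion z-test with a pooled standard error.

    Returns (z, two-sided p-value). Illustrative only: it treats the two
    groups as independent, which is an assumption not made in the source.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts of reliably reproducible features reported in the abstract.
reliable_2d, reliable_3d, total = 468, 576, 1032
z, p = two_proportion_z(reliable_2d, total, reliable_3d, total)
print(f"2D: {reliable_2d / total:.1%}  3D: {reliable_3d / total:.1%}")
print(f"z = {z:.2f}, two-sided p = {p:.2g}")
```

Under these assumptions the difference of about 10 percentage points is well beyond the p < 0.001 threshold, which is consistent with the reported result but does not reproduce the authors' analysis.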
Three-dimensional segmentation yielded better CT texture feature reproducibility, but 2D segmentation showed better classification performance. Given its lower cost, 2D segmentation is preferable for ML-based classification of HPV status in OPSCC.
• Three-dimensional segmentation had better CT texture feature reproducibility than 2D segmentation.
• Despite yielding more features with reliable reproducibility, 3D segmentation failed to provide better classification performance as compared to 2D for predicting HPV status of oropharyngeal squamous cell carcinoma.
• The best models for 2D and 3D segmentations were both created using random forest classifier.