Department of Radiology, Research Institute of Radiological Science, Center for Clinical Imaging Data Science, Yonsei University College of Medicine, Seoul, South Korea.
Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, South Korea.
Eur Radiol. 2024 May;34(5):3102-3112. doi: 10.1007/s00330-023-10338-3. Epub 2023 Oct 18.
To develop and validate a multiparametric MRI-based radiomics model with optimal oversampling and machine learning techniques for predicting human papillomavirus (HPV) status in oropharyngeal squamous cell carcinoma (OPSCC).
This retrospective, multicenter study included consecutive patients with newly diagnosed and pathologically confirmed OPSCC between January 2017 and December 2020 (110 patients in the training set, 44 patients in the external validation set). A total of 293 radiomics features were extracted from three sequences (T2-weighted images [T2WI], contrast-enhanced T1-weighted images [CE-T1WI], and apparent diffusion coefficient [ADC] maps). Combinations of three feature selection, five oversampling, and 12 machine learning techniques were evaluated to optimize the diagnostic performance of the model. The area under the receiver operating characteristic curve (AUC) of the top five models was validated in the external validation set.
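The size of the search space described above can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract gives only the number of techniques in each category, so the labels below are hypothetical placeholders, and the sketch only enumerates the candidate combinations for one radiomics model.

```python
from itertools import product

# Placeholder labels: the abstract reports only the counts (3, 5, 12),
# not the individual technique names, so these names are hypothetical.
feature_selectors = [f"feature_selection_{i}" for i in range(1, 4)]
oversamplers = [f"oversampling_{i}" for i in range(1, 6)]
classifiers = [f"classifier_{i}" for i in range(1, 13)]


def enumerate_pipelines():
    """Return every (selector, oversampler, classifier) combination."""
    return list(product(feature_selectors, oversamplers, classifiers))


pipelines = enumerate_pipelines()
print(len(pipelines))  # 3 * 5 * 12 = 180 candidate pipelines
```

Each tuple would then be trained and scored on the training set, with only the best-ranked candidates carried forward to external validation.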
A total of 154 patients (59.2 ± 9.1 years; 132 men [85.7%]) were included, and oversampling was employed to account for data imbalance between HPV-positive and HPV-negative OPSCC (86.4% [133/154] vs. 13.6% [21/154]). For the ADC radiomics model, the combination of random oversampling and ridge showed the highest diagnostic performance in the external validation set (AUC, 0.791; 95% CI, 0.775-0.808). The ADC radiomics model tended to show higher diagnostic performance than the radiomics models using CE-T1WI (AUC, 0.604; 95% CI, 0.590-0.618), T2WI (AUC, 0.695; 95% CI, 0.673-0.717), and a combination of CE-T1WI and T2WI (AUC, 0.642; 95% CI, 0.626-0.657).
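Random oversampling and the AUC metric reported above can be illustrated with a short standard-library sketch. It is an assumption-laden example rather than the study's implementation: the function names `random_oversample` and `auc` are hypothetical, the feature values are synthetic, and the class counts are the cohort-wide figures from the abstract (in practice, resampling would normally be applied to the training set only). The ridge classifier itself is not reproduced here.

```python
import random


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw with replacement from the existing samples of this class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Class counts from the abstract: 133 HPV-positive vs. 21 HPV-negative.
# The single feature per patient is a synthetic placeholder.
labels = [1] * 133 + [0] * 21
features = [[float(i)] for i in range(154)]
bal_x, bal_y = random_oversample(features, labels)
print(bal_y.count(1), bal_y.count(0))  # 133 133
```

Because random oversampling only repeats existing minority samples, it adds no new information; it changes the class weighting that the downstream classifier sees during training.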
The ADC radiomics model using random oversampling and ridge showed the highest diagnostic performance in predicting the HPV status of OPSCC in the external validation set.
Among multiple sequences, the ADC radiomics model has potential for generalizability and applicability in clinical practice. Exploring multiple oversampling and machine learning techniques was a valuable strategy for optimizing radiomics model performance.
• Previous radiomics studies using multiparametric MRI were conducted at single centers without external validation and had unresolved data imbalances.
• Among the ADC, CE-T1WI, and T2WI radiomics models and the ADC histogram models, the ADC radiomics model was the best-performing model for predicting human papillomavirus status in oropharyngeal squamous cell carcinoma.
• The ADC radiomics model with the combination of random oversampling and ridge showed the highest diagnostic performance.