Gresser Eva, Schachtner Balthasar, Stüber Anna Theresa, Solyanik Olga, Schreier Andrea, Huber Thomas, Froelich Matthias Frank, Magistro Giuseppe, Kretschmer Alexander, Stief Christian, Ricke Jens, Ingrisch Michael, Nörenberg Dominik
Department of Radiology, University Hospital, LMU Munich, Munich, Germany.
Comprehensive Pneumology Center (CPC-M), Member of the German Center for Lung Research (DZL), Munich, Germany.
Quant Imaging Med Surg. 2022 Nov;12(11):4990-5003. doi: 10.21037/qims-22-265.
Radiomics promises to enhance the discriminative performance for clinically significant prostate cancer (csPCa), but still lacks validation in real-life scenarios. This study investigates the classification performance and robustness of machine learning radiomics models that characterize suspicious prostate lesions in heterogeneous MRI datasets for non-invasive prediction of prostate cancer (PCa) aggressiveness, and compares them with conventional imaging biomarkers.
A total of 142 patients with clinical suspicion of PCa underwent 1.5T or 3T biparametric MRI (7 scanner types, 14 institutions) and exhibited suspicious lesions [Prostate Imaging Reporting and Data System (PI-RADS) score ≥3] in the peripheral or transition zone. Whole-gland and index-lesion segmentations were performed semi-automatically. A total of 1,482 quantitative morphologic, shape, texture, and intensity-based radiomics features were extracted from T2-weighted and apparent diffusion coefficient (ADC) images and assessed using random forest and logistic regression models. Five-fold cross-validation performance, measured as area under the receiver operating characteristic (ROC) curve, was compared to that of mean ADC (mADC), PI-RADS, and prostate-specific antigen density (PSAD). Bias mitigation techniques targeting the high-dimensional feature space and inherent class imbalance were applied, and the robustness of results was systematically evaluated.
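The comparison against a scalar clinical biomarker can be illustrated with a minimal sketch. It is not the study's pipeline: the cohort below is synthetic, the helper names (`psa_density`, `roc_auc`, `stratified_folds`, `fold_aucs`) are hypothetical, and only the Python standard library is used. It computes PSAD as PSA divided by prostate volume, estimates the AUC with the Mann-Whitney rank statistic, and reports the spread of the AUC across five stratified folds, which is one simple way to look at performance variability.

```python
import random
from statistics import mean, stdev


def psa_density(psa_ng_ml, volume_ml):
    """PSAD: serum PSA (ng/mL) divided by prostate volume (mL)."""
    return psa_ng_ml / volume_ml


def roc_auc(labels, scores):
    """AUC as the probability that a positive case scores higher than a
    negative case (Mann-Whitney statistic); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=5, seed=0):
    """Split case indices into k folds, keeping the class ratio per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def fold_aucs(labels, scores, k=5, seed=0):
    """AUC of a fixed scalar biomarker evaluated separately on each fold."""
    return [
        roc_auc([labels[i] for i in fold], [scores[i] for i in fold])
        for fold in stratified_folds(labels, k, seed)
    ]


# Synthetic toy cohort (assumption for illustration only, not study data):
# 40 csPCa cases and 60 non-csPCa cases with made-up PSA and volume ranges.
rng = random.Random(42)
labels = [1] * 40 + [0] * 60
psad = [
    psa_density(
        rng.uniform(4, 20) if y == 1 else rng.uniform(2, 12),
        rng.uniform(30, 80),
    )
    for y in labels
]

aucs = fold_aucs(labels, psad, k=5, seed=0)
print("per-fold AUC:", [round(a, 2) for a in aucs])
print("mean:", round(mean(aucs), 2), "std:", round(stdev(aucs), 2))
```

For a biomarker where lower values indicate disease, such as mean ADC, the negated value would be passed as the score so that higher scores still correspond to the positive class.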
Trained models showed mean areas under the curve (AUCs) ranging from 0.78 to 0.83 in csPCa classification. Despite the mitigation techniques, results showed high performance variability. On average, trained models achieved numerically higher classification performance than the clinical parameters PI-RADS (AUC = 0.78), mADC (AUC = 0.71), and PSAD (AUC = 0.63).
The classification performance of radiomics models for csPCa was numerically but not significantly higher than that of PI-RADS scoring. Overall, clinical applicability in heterogeneous MRI datasets is limited because of the high variability of results. Performance variability, robustness, and reproducibility of radiomics-based measures should be addressed more transparently in future research to enable broad clinical application.