Castillo T Jose M, Arif Muhammad, Starmans Martijn P A, Niessen Wiro J, Bangma Chris H, Schoots Ivo G, Veenland Jifke F
Department of Radiology and Nuclear Medicine, Erasmus MC, 3015 GD Rotterdam, The Netherlands.
Faculty of Applied Sciences, Delft University of Technology, Lorentzweg 1, 2628 CJ Delft, The Netherlands.
Cancers (Basel). 2021 Dec 21;14(1):12. doi: 10.3390/cancers14010012.
The computer-aided analysis of prostate multiparametric MRI (mpMRI) could improve significant-prostate-cancer (PCa) detection. Various deep-learning- and radiomics-based methods for significant-PCa segmentation or classification have been reported in the literature. To assess how well the performance of these methods generalizes, evaluation on various external data sets is crucial. While deep-learning and radiomics approaches have been compared on the same data set from one center, a comparison of their performance on data sets from different centers and different scanners is lacking. The goal of this study was to compare the performance of a deep-learning model with that of a radiomics model for significant-PCa diagnosis in various patient cohorts. We included the data from two consecutive patient cohorts from our own center (n = 371 patients), and two external sets, of which one was a publicly available patient cohort (n = 195 patients) and the other contained data from patients from two hospitals (n = 79 patients). The mpMRI scans, the radiologist tumor delineations and the pathology reports were collected for all patients. During training, one of our patient cohorts (n = 271 patients) was used for both the deep-learning- and radiomics-model development, and the three remaining cohorts (n = 374 patients) were kept as unseen test sets. The performance of the models was assessed in terms of the area under the receiver-operating-characteristic curve (AUC). Whereas the internal cross-validation showed a higher AUC for the deep-learning approach, the radiomics model obtained AUCs of 0.88, 0.91 and 0.65 on the independent test sets compared to AUCs of 0.70, 0.73 and 0.44 for the deep-learning model.
Our radiomics model, which was based on delineated regions, was a more accurate tool for significant-PCa classification in the three unseen test sets than a fully automated deep-learning model.
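The AUC used to compare the two models can be read as the probability that a randomly chosen significant-PCa case receives a higher score than a randomly chosen non-significant case. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `roc_auc`, the labels and the scores are hypothetical toy values and do not come from the study.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative cases.

    labels: 1 for significant PCa, 0 otherwise (hypothetical encoding).
    scores: model outputs, where higher means more suspicious.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data for illustration only (assumed, not taken from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores_model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
scores_model_b = [0.6, 0.3, 0.2, 0.7, 0.5, 0.1]

if __name__ == "__main__":
    print("model A AUC:", round(roc_auc(labels, scores_model_a), 2))
    print("model B AUC:", round(roc_auc(labels, scores_model_b), 2))
```

An AUC of 0.5 corresponds to chance-level ranking, so a value below 0.5, such as the 0.44 reported for the deep-learning model on one test set, means the model ranked cases slightly worse than chance on that cohort.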