The impact of multicentric datasets for the automated tumor delineation in primary prostate cancer using convolutional neural networks on 18F-PSMA-1007 PET.

Author information

Holzschuh Julius C, Mix Michael, Freitag Martin T, Hölscher Tobias, Braune Anja, Kotzerke Jörg, Vrachimis Alexis, Doolan Paul, Ilhan Harun, Marinescu Ioana M, Spohn Simon K B, Fechter Tobias, Kuhn Dejan, Gratzke Christian, Grosu Radu, Grosu Anca-Ligia, Zamboglou C

Affiliations

Department of Radiation Oncology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, German Cancer Consortium (DKTK), Partner Site DKTK, Freiburg, Germany.

Division of Radiology, German Cancer Research Center (DKFZ), Heidelberg, Germany.

Publication information

Radiat Oncol. 2024 Aug 7;19(1):106. doi: 10.1186/s13014-024-02491-w.

Abstract

PURPOSE

Convolutional Neural Networks (CNNs) have emerged as transformative tools in the field of radiation oncology, significantly advancing the precision of contouring practices. However, the adaptability of these algorithms across diverse scanners, institutions, and imaging protocols remains a considerable obstacle. This study investigates how incorporating institution-specific datasets into CNN training affects generalization in real-world clinical environments. In a data-centric analysis, it examines the influence of multi-center and single-center training approaches on algorithm performance.

METHODS

nnU-Net is trained using a dataset comprising 161 18F-PSMA-1007 PET images collected from four distinct institutions (Freiburg: n = 96, Munich: n = 19, Cyprus: n = 32, Dresden: n = 14). The dataset is partitioned such that data from each center are, in turn, excluded from training and used solely for testing, to assess the model's generalizability and adaptability to data from unfamiliar sources. Performance is evaluated through 5-fold cross-validation, comparing models trained on single-center datasets with those trained on aggregated multi-center datasets. The Dice similarity coefficient (DSC), Hausdorff distance, and volumetric analysis are used as primary evaluation metrics.
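The leave-one-center-out partitioning and the Dice metric described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: only the center names and case counts come from the abstract, while the case IDs, the function names (`build_cases`, `leave_one_center_out`, `dice`), and the handling of two empty masks are assumptions made for the example.

```python
# Case counts per center, as reported in the abstract.
CENTER_COUNTS = {"Freiburg": 96, "Munich": 19, "Cyprus": 32, "Dresden": 14}


def build_cases(center_counts):
    """Create hypothetical case IDs such as 'Freiburg_000' for illustration."""
    return [
        (center, f"{center}_{i:03d}")
        for center, n in center_counts.items()
        for i in range(n)
    ]


def leave_one_center_out(cases):
    """Yield (held_out_center, train_ids, test_ids), one split per center."""
    centers = sorted({center for center, _ in cases})
    for held_out in centers:
        train = [cid for center, cid in cases if center != held_out]
        test = [cid for center, cid in cases if center == held_out]
        yield held_out, train, test


def dice(pred, ref):
    """Dice similarity coefficient for two flat binary masks (0/1 sequences)."""
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(pred) + sum(ref)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


cases = build_cases(CENTER_COUNTS)
splits = {c: (train, test) for c, train, test in leave_one_center_out(cases)}

for center, (train, test) in splits.items():
    print(f"held out {center}: {len(train)} training cases, {len(test)} test cases")
```

Each split tests only on a center the model never saw during training, which is what distinguishes this setup from the mixed cross-validation, where every center contributes to both training and testing folds.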

RESULTS

The mixed training approach yielded a median DSC of 0.76 (IQR: 0.64-0.84) in five-fold cross-validation, not significantly different (p = 0.18) from models trained with each center's data excluded, which achieved a median DSC of 0.74 (IQR: 0.56-0.86). Significant performance improvements with multi-center training were observed for the Dresden cohort (multi-center median DSC 0.71, IQR: 0.58-0.80 vs. single-center 0.68, IQR: 0.50-0.80, p < 0.001) and the Cyprus cohort (multi-center 0.74, IQR: 0.62-0.83 vs. single-center 0.72, IQR: 0.54-0.82, p < 0.01). Munich and Freiburg also showed performance improvements with multi-center training, but these were not statistically significant (Munich: multi-center DSC 0.74, IQR: 0.60-0.80 vs. single-center 0.72, IQR: 0.59-0.82, p > 0.05; Freiburg: multi-center 0.78, IQR: 0.53-0.87 vs. single-center 0.71, IQR: 0.53-0.83, p = 0.23).
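The results above are reported as a median DSC with an interquartile range. A minimal sketch of that kind of summary follows; the per-case values are made up for illustration and are not study data, and the quantile convention (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption, since the abstract does not state which one was used.

```python
import statistics


def summarize(dsc_values):
    """Return (median, q1, q3) for a list of per-case DSC values."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(dsc_values, n=4, method="inclusive")
    return median, q1, q3


# Hypothetical per-case DSC values, for illustration only.
hypothetical_dsc = [0.50, 0.60, 0.70, 0.80, 0.90]

median, q1, q3 = summarize(hypothetical_dsc)
print(f"median DSC {median:.2f} (IQR: {q1:.2f}-{q3:.2f})")
```

Different quantile conventions can shift the quartiles slightly on small cohorts such as Dresden (n = 14), so the convention is worth stating when comparing numbers across studies.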

CONCLUSION

CNNs trained for auto-contouring of the intraprostatic gross tumor volume (GTV) on 18F-PSMA-1007 PET using a diverse dataset from multiple centers mostly generalize well to unseen data from other centers. Training on a multicentric dataset can improve performance compared with training exclusively on a single-center dataset for intraprostatic GTV segmentation on 18F-PSMA-1007 PET. The segmentation performance of the same CNN can vary depending on the datasets employed for training and testing.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89ee/11304577/0390c72e1057/13014_2024_2491_Fig1_HTML.jpg
