Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, CH-1211, Geneva 4, Switzerland.
Department of Theoretical Physics and Center for Astroparticle Physics, University of Geneva, Geneva, Switzerland.
Eur J Nucl Med Mol Imaging. 2023 Mar;50(4):1034-1050. doi: 10.1007/s00259-022-06053-8. Epub 2022 Dec 12.
Attenuation correction and scatter compensation (AC/SC) are two main steps toward quantitative PET imaging, and both remain challenging in PET-only and PET/MRI systems. They can be tackled effectively with deep learning (DL) methods. However, trustworthy and generalizable DL models commonly require large, well-curated, heterogeneous datasets from multiple clinical centers, and legal/ethical issues and privacy concerns make it difficult to assemble such a centralized dataset. In this work, we aimed to develop a DL-based model for AC/SC of PET images in a multicenter setting, using federated learning (FL) so that no data are shared directly.
Non-attenuation/scatter corrected and CT-based attenuation/scatter corrected (CT-ASC) 18F-FDG PET images of 300 patients were included in this study. The dataset came from 6 different centers, each contributing 50 patients, with scanners, image acquisition, and reconstruction protocols varying across the centers. CT-ASC PET images served as the standard reference. All images were reviewed so that only high-quality, artifact-free PET images were included. Both corrected and uncorrected PET images were converted to standardized uptake values (SUVs). We used a modified nested U-Net that employs residual U-blocks in a U-shaped architecture. We evaluated two FL models, namely sequential (FL-SQ) and parallel (FL-PL), and compared their performance with a baseline centralized (CZ) learning model, in which the data were pooled on one server, and with center-based (CB) models, in which a model was built and evaluated separately for each center. Data from each center were divided into training (30 patients), validation (10 patients), and test (10 patients) sets. Final evaluations were performed and reported on 60 patients (10 patients from each center).
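The difference between the parallel and sequential FL strategies can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state the aggregation rule, so the parallel variant uses sample-size-weighted averaging of the center weights and the sequential variant passes the weights from one center to the next. A toy linear regression model stands in for the nested U-Net, and the helper names (`local_update`, `parallel_round`, `sequential_round`, `make_centers`, `train`) are hypothetical.

```python
import numpy as np

def local_update(weights, x, y, lr=0.1, epochs=5):
    """Run a few gradient-descent steps of least-squares regression on one center's data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * x.T @ (x @ w - y) / len(y)
        w -= lr * grad
    return w

def parallel_round(global_w, centers):
    """Parallel FL (assumed rule): all centers start from the same weights,
    train locally, and the server averages the results weighted by sample count."""
    local = [local_update(global_w, x, y) for x, y in centers]
    sizes = np.array([len(y) for _, y in centers], dtype=float)
    return np.average(local, axis=0, weights=sizes)

def sequential_round(w, centers):
    """Sequential FL (assumed rule): the weights visit the centers one after another."""
    for x, y in centers:
        w = local_update(w, x, y)
    return w

def make_centers(n_centers=6, n_per_center=30, seed=0):
    """Synthetic stand-in for six centers with 30 training cases each."""
    rng = np.random.default_rng(seed)
    true_w = np.array([1.5, -0.5])
    centers = []
    for _ in range(n_centers):
        x = rng.normal(size=(n_per_center, 2))
        y = x @ true_w + rng.normal(scale=0.05, size=n_per_center)
        centers.append((x, y))
    return true_w, centers

def train(round_fn, centers, rounds=50):
    """Repeat communication rounds starting from zero weights."""
    w = np.zeros(2)
    for _ in range(rounds):
        w = round_fn(w, centers)
    return w

if __name__ == "__main__":
    true_w, centers = make_centers()
    print("parallel  :", train(parallel_round, centers))
    print("sequential:", train(sequential_round, centers))
    print("true      :", true_w)
```

In both variants only model weights leave a center, never the images, which is the property the study relies on.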
In terms of percent SUV absolute relative error (ARE%), both the FL-SQ (CI: 12.21-14.81%) and FL-PL (CI: 11.82-13.84%) models demonstrated excellent agreement with the centralized framework (CI: 10.32-12.00%), while FL-based algorithms improved model performance by over 11% compared to the CB training strategy (CI: 22.34-26.10%). Furthermore, the Mann-Whitney test between the different strategies revealed no significant differences between CZ and FL-based algorithms (p-value > 0.05) in center-categorized mode. At the same time, a significant difference was observed between the different training approaches on the overall dataset (p-value < 0.05). In addition, voxel-wise comparison with the reference CT-ASC images showed similar performance for images predicted by CZ (R² = 0.94), FL-SQ (R² = 0.93), and FL-PL (R² = 0.92), while the CB model achieved a far lower coefficient of determination (R² = 0.74). Despite the strong correlation of the CZ and FL-based predictions with the reference CT-ASC images, a slight underestimation of predicted voxel values was observed.
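The two voxel-wise metrics reported above can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation code: ARE% is taken here as the mean of |predicted - reference| / reference × 100 over voxels with non-zero reference SUV, and the coefficient of determination is taken as the squared Pearson correlation between predicted and reference voxel values. The function names `are_percent` and `r_squared` and the `eps` threshold are hypothetical.

```python
import numpy as np

def are_percent(pred, ref, eps=1e-6):
    """Mean absolute relative error in percent over voxels where ref > eps (assumed mask)."""
    mask = ref > eps
    return float(np.mean(np.abs(pred[mask] - ref[mask]) / ref[mask]) * 100.0)

def r_squared(pred, ref):
    """Squared Pearson correlation between predicted and reference voxel values."""
    r = np.corrcoef(pred.ravel(), ref.ravel())[0, 1]
    return float(r ** 2)

if __name__ == "__main__":
    # Toy SUV values; a uniform 10% underestimation of the reference.
    ref = np.array([1.0, 2.0, 4.0, 8.0])
    pred = 0.9 * ref
    print("ARE%:", are_percent(pred, ref))
    print("R^2 :", r_squared(pred, ref))
```

The toy case shows why the two metrics are reported together: a prediction that systematically underestimates the reference can still be almost perfectly correlated with it, so a high R² alone does not rule out a quantitative bias.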
Deep learning-based models provide promising results toward quantitative PET image reconstruction. Specifically, we developed two FL models and compared their performance with center-based and centralized models. The proposed FL-based models outperformed the center-based models and performed comparably to the centralized model. Our work provides strong empirical evidence that the FL framework can fully benefit from the generalizability and robustness of DL models used for AC/SC in PET, while obviating the need for direct sharing of datasets between clinical imaging centers.