Nuclear Medicine - Radiopharmacology Department, Champalimaud Foundation, Av. Brasília, 1400-038, Lisbon, Portugal.
Hematology Department, Champalimaud Foundation, Av. Brasília, 1400-038, Lisbon, Portugal.
J Digit Imaging. 2023 Aug;36(4):1864-1876. doi: 10.1007/s10278-023-00823-y. Epub 2023 Apr 14.
The objective is to assess the performance of seven semiautomatic and two fully automatic segmentation methods on [18F]FDG PET/CT lymphoma images and to evaluate their influence on tumor quantification. All lymphoma lesions identified in 65 whole-body [18F]FDG PET/CT staging images were segmented by two experienced observers using manual and semiautomatic methods. The semiautomatic methods were absolute and relative thresholds, k-means and Bayesian clustering, and a self-adaptive configuration (SAC) of k-means and Bayesian clustering. Three state-of-the-art deep learning-based segmentation methods using a 3D U-Net architecture were also applied: one semiautomatic and two fully automatic, one of which is publicly available. The Dice coefficient (DC) measured segmentation overlap, with manual segmentation taken as the ground truth. Lymphoma lesions were characterized by 31 features. The intraclass correlation coefficient (ICC) assessed feature agreement between the different segmentation methods. Nine hundred twenty [18F]FDG-avid lesions were identified. The SAC Bayesian method achieved the highest median intra-observer DC (0.87). Inter-observer DC was higher for SAC Bayesian than for manual segmentation (0.94 vs 0.84, p < 0.001). The median DC of the semiautomatic deep learning-based method was promising (0.83 (Obs1), 0.79 (Obs2)). Threshold-based methods and the publicly available 3D U-Net gave poorer results (0.56 ≤ DC ≤ 0.68). Maximum, mean, and peak standardized uptake values, metabolic tumor volume, and total lesion glycolysis showed excellent agreement (ICC ≥ 0.92) between the manual and SAC Bayesian segmentation methods. The SAC Bayesian classifier is more reproducible than manual segmentation and produces similar lesion features, giving the most concordant results of all the methods tested. Deep learning-based segmentation can achieve good overall segmentation results but failed in a few patients, which affected those patients' clinical evaluation.
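For readers unfamiliar with the overlap metric, the Dice coefficient between two segmentations A and B is conventionally defined as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition, not the authors' implementation: the function name `dice_coefficient`, the representation of a segmentation as a set of voxel index tuples, and the toy lesion volumes are all assumptions made for illustration.

```python
from itertools import product


def dice_coefficient(voxels_a, voxels_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two sets of voxel indices."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        # Both segmentations empty: treated as perfect agreement
        # (a convention assumed here, not taken from the source).
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical toy data: a "manual" lesion of 2x2x2 = 8 voxels and a
# candidate segmentation of 2x2x3 = 12 voxels that fully contains it.
manual = set(product(range(1, 3), range(1, 3), range(1, 3)))
candidate = set(product(range(1, 3), range(1, 3), range(1, 4)))

dc = dice_coefficient(manual, candidate)
print(f"DC = {dc:.2f}")  # 2 * 8 / (8 + 12) = 0.80
```

A DC of 1 means the two segmentations coincide exactly and 0 means they share no voxels, so the reported median of 0.87 for SAC Bayesian indicates substantially closer agreement with manual delineation than the 0.56 to 0.68 range of the threshold-based methods.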