Dong Yafei, Marin Thibault, Zhuo Yue, Najem Elie, Beddok Arnaud, Rozenblum Laura, Moteabbed Maryam, Grogg Kira, Xing Fangxu, Woo Jonghye, Chen Yen-Lin E, Lim Ruth, Liu Xiaofeng, Ma Chao, El Fakhri Georges
Yale Biomedical Imaging Institute, Yale University School of Medicine, New Haven, Connecticut, USA.
Department of Radiology and Biomedical Imaging, Yale University School of Medicine, New Haven, Connecticut, USA.
Med Phys. 2025 May 3. doi: 10.1002/mp.17865.
Accurate delineation of the clinical target volume (CTV) is essential in the radiotherapy treatment of soft tissue sarcomas. However, this process is subject to inter-reader variability due to the need for clinical assessment of risk and extent of potential microscopic spread. This can lead to inconsistencies in treatment planning, potentially impacting treatment outcomes. Most existing automatic CTV delineation methods do not account for this variability and can only generate a single CTV for each case.
This study aims to develop a deep learning-based technique that generates multiple CTV contours for each case, simulating the inter-reader variability seen in clinical practice.
We employed a publicly available dataset consisting of fluorodeoxyglucose positron emission tomography (FDG-PET), x-ray computed tomography (CT), and pre-contrast T1-weighted magnetic resonance imaging (MRI) scans from 51 patients with soft tissue sarcoma, along with an independent validation set containing five additional patients. An experienced reader contoured the gross tumor volume (GTV) for each patient based on the multi-modality images. The first reader and two additional readers then each contoured a CTV based on the GTV, giving three CTVs per patient. We developed a diffusion model-based deep learning method capable of generating an arbitrary number of different, plausible CTVs to mimic the inter-reader variability in CTV delineation. The proposed model incorporates a separate encoder to extract features from the GTV masks, leveraging the critical role of GTV information in accurate CTV delineation.
The proposed diffusion model achieved the highest Dice index (0.902, compared to values below 0.881 for state-of-the-art models) and the lowest, and therefore best, generalized energy distance (GED) (0.209, compared to values exceeding 0.221 for state-of-the-art models). It also achieved the second-highest recall and precision among the compared ambiguous image segmentation models. Results from both datasets exhibited consistent trends, reinforcing the reliability of our findings. Additionally, ablation studies exploring different model structures and input configurations highlighted the significance of incorporating prior GTV information for accurate CTV delineation.
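For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how a Dice index and a GED might be computed for binary masks. It is not the authors' evaluation code. It assumes the form of GED commonly used in the ambiguous-segmentation literature, the squared distance 2E[d(S,Y)] - E[d(S,S')] - E[d(Y,Y')] with d = 1 - IoU, where S are generated contours and Y are reader contours; the abstract does not state which variant, or which aggregation over the three readers, was used. The function names `dice`, `iou_distance`, and `ged_squared` are hypothetical.

```python
import numpy as np


def dice(a, b):
    """Dice index between two binary masks (1.0 when both are empty)."""
    a = np.asarray(a).astype(bool)
    b = np.asarray(b).astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom


def iou_distance(a, b):
    """Distance d = 1 - IoU between two binary masks (0.0 when both are empty)."""
    a = np.asarray(a).astype(bool)
    b = np.asarray(b).astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return 1.0 - np.logical_and(a, b).sum() / union


def ged_squared(samples, references):
    """Squared GED between generated masks and reader masks.

    Assumed definition (not confirmed by the abstract):
    2*E[d(S, Y)] - E[d(S, S')] - E[d(Y, Y')], with d = 1 - IoU.
    Lower values mean the generated set better matches the reader set.
    """
    cross = np.mean([iou_distance(s, y) for s in samples for y in references])
    within_s = np.mean([iou_distance(s, t) for s in samples for t in samples])
    within_y = np.mean([iou_distance(y, z) for y in references for z in references])
    return float(2.0 * cross - within_s - within_y)


if __name__ == "__main__":
    # Toy 1-D "masks" standing in for CTV volumes (illustrative data only).
    reader_a = np.array([1, 1, 0, 0])
    reader_b = np.array([1, 0, 0, 0])
    print(dice(reader_a, reader_b))
    print(ged_squared([reader_a, reader_b], [reader_a, reader_b]))
```

Under this definition the GED is zero when the generated set and the reader set coincide, and it penalizes both poor overlap with the readers and a mismatch in diversity, which is why it suits a model meant to reproduce inter-reader variability rather than a single consensus contour.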
The proposed diffusion model successfully generates multiple plausible CTV contours for soft tissue sarcomas, effectively capturing inter-reader variability in CTV delineation.