Sun Hongfei, Duan Jie, Zhu Jiarui, Li Zhihui, Liu Yufen, Liu Changhao, Li Jie, Shi Zihan, Li Ningning, Gong Jie, Li Xiaokai, Wang Zhongfei, Li Dong, Shi Mei, Zhao Lina
Department of Radiation Oncology, Xijing Hospital, Fourth Military Medical University, Xi'an, China.
Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong.
Comput Methods Programs Biomed. 2025 Oct;270:108965. doi: 10.1016/j.cmpb.2025.108965. Epub 2025 Jul 12.
In nasopharyngeal carcinoma (NPC), the gross tumor volume (GTV) varies unpredictably in location, size, and shape. This study proposes an adaptive deep learning model to improve the accuracy of automatic GTV delineation for primary and metastatic NPC lesions.
This study included 529 retrospective cases and 4 prospective cases from multiple centers, all with both CT and MRI. Adaptive GTV delineation was achieved with a conditional denoising diffusion probabilistic model (DDPM) equipped with an "inter-modal and intra-modal" attention-aware mechanism. Within a single training epoch, the inter-modal aware mechanism linked the frequency of potentially effective features at identical coordinates across the multimodal images to tumor locations, so that the model progressively focused on high-frequency GTV-related features. Across multiple rounds of training, the intra-modal aware mechanism established the repeatability of feature extraction at identical coordinates within each modality, which stabilized feature extraction. By leveraging both mechanisms, the model dynamically calibrated the fusion weights for multimodal features, determining each feature's significance for GTV identification from its positional frequency. Delineation accuracy was assessed with the Dice similarity coefficient (DSC, %), the 95% Hausdorff distance (HD95, mm), and the mean surface distance (MSD, mm).
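The three metrics named above can be sketched as follows. This is a minimal NumPy illustration, not the authors' evaluation code: the function names, the 6-neighbour surface definition, and the symmetric forms of HD95 and MSD are assumptions, and several conventions for these two distances exist in the literature. The brute-force pairwise distance is only practical for small masks.

```python
import numpy as np

def dice(pred, ref):
    """Dice similarity coefficient in percent for two binary masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return 100.0  # both masks empty: treated as perfect agreement (assumption)
    return 200.0 * np.logical_and(pred, ref).sum() / denom

def surface_points(mask, spacing):
    """Physical coordinates (mm) of foreground voxels with a background face-neighbour."""
    padded = np.pad(mask.astype(bool), 1, constant_values=False)
    interior = padded.copy()
    for axis in range(padded.ndim):
        # A voxel stays "interior" only if both neighbours along this axis are foreground.
        interior &= np.roll(padded, 1, axis) & np.roll(padded, -1, axis)
    surface = padded & ~interior
    idx = np.argwhere(surface) - 1  # undo the padding offset
    return idx * np.asarray(spacing, dtype=float)

def surface_distances(pred, ref, spacing):
    """Nearest-surface distances in both directions (pred->ref, ref->pred)."""
    p = surface_points(pred, spacing)
    r = surface_points(ref, spacing)
    if len(p) == 0 or len(r) == 0:
        raise ValueError("both masks must be non-empty")
    d = np.linalg.norm(p[:, None, :] - r[None, :, :], axis=-1)
    return d.min(axis=1), d.min(axis=0)

def hd95(pred, ref, spacing):
    """95th-percentile Hausdorff distance: max of the two directed 95th percentiles."""
    d_pr, d_rp = surface_distances(pred, ref, spacing)
    return max(np.percentile(d_pr, 95), np.percentile(d_rp, 95))

def msd(pred, ref, spacing):
    """Mean surface distance, averaged over the surface points of both masks."""
    d_pr, d_rp = surface_distances(pred, ref, spacing)
    return (d_pr.sum() + d_rp.sum()) / (len(d_pr) + len(d_rp))

# Toy example: a 4x4x4 cube and the same cube shifted by one voxel.
ref = np.zeros((10, 10, 10), dtype=bool)
ref[2:6, 2:6, 2:6] = True
pred = np.zeros_like(ref)
pred[3:7, 2:6, 2:6] = True
print(dice(pred, ref), hd95(pred, ref, (1, 1, 1)), msd(pred, ref, (1, 1, 1)))
```

For clinical-size volumes, a distance-transform implementation would replace the pairwise distance matrix; the definitions stay the same.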
On the internal test set, the adaptive delineation model yielded mean ± SD values of 81.36 ± 2.14% for DSC, 4.30 ± 2.14 mm for HD95, and 3.70 ± 1.84 mm for MSD. The corresponding values on the external test set were 77.43 ± 3.41%, 6.07 ± 3.10 mm, and 4.31 ± 2.64 mm. Paired t-tests confirmed statistically significant differences between our model and current state-of-the-art (SOTA) models for automatic GTV delineation. The model attained DSC values of 83.45 ± 1.75% for primary tumors and 73.43 ± 5.32% for metastatic lesions, with HD95 values of 4.28 ± 2.08 mm and 7.33 ± 3.45 mm and MSD values of 2.73 ± 1.70 mm and 5.90 ± 3.18 mm, respectively. Delineation was therefore more accurate for primary tumors. In prospective dosimetric validation, the average dose difference between automatically and manually delineated GTVs was 0.36 Gy within primary tumors, less than the 0.52 Gy difference in lymph node metastases, which is consistent with the pattern seen in retrospective validation.
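A paired t-test of the kind mentioned above compares two models on the same cases by testing whether the mean per-case difference is zero. The sketch below computes only the t statistic and degrees of freedom using the standard library; the function name `paired_t` and the per-case DSC values are hypothetical placeholders for illustration, not data from the study.

```python
import math
import statistics

def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Hypothetical per-case DSC values (%) for two models on the same five cases.
dsc_model_a = [81.0, 83.5, 79.2, 84.1, 80.7]
dsc_model_b = [78.4, 80.9, 77.5, 81.0, 79.9]
t_stat, dof = paired_t(dsc_model_a, dsc_model_b)
print(f"t = {t_stat:.3f}, df = {dof}")
```

The p-value follows from the Student t distribution with `dof` degrees of freedom; in practice a statistics package would supply both the statistic and the p-value in one call.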
The proposed deep learning model demonstrated high accuracy and stability in automatic GTV delineation for NPC cases, indicating its potential for clinical application in NPC radiotherapy across different centers.