Lou Xiao, Zhu Juan, Yang Jian, Zhu Youzhe, Shu Huazhong, Li Baosheng
Laboratory of Image Science and Technology, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Ministry of Education, Southeast University, Sipailou 2, Nanjing, P.R. China.
Department of Radiotherapy, Lishui People's Hospital, No. 1188, Liyang Street, Lishui, P.R. China.
BMC Med Imaging. 2024 Dec 18;24(1):339. doi: 10.1186/s12880-024-01515-x.
Segmentation of the target volume and organs at risk (OAR) is a significant part of radiotherapy. Determining the location and extent of the esophagus in simulation computed tomography (CT) images is difficult and time-consuming, primarily because of its complex structure and low contrast with the surrounding tissues. In this study, an Enhanced Cross-stage-attention U-Net was proposed to segment the esophageal gross tumor volume (GTV) and clinical target volume (CTV) in CT images.
First, a module based on principal component analysis theory was constructed to pre-extract the features of the input image. Then, a cross-stage feature fusion model was designed to replace the skip connections of the original U-Net. This model was composed of a Wide Range Attention (WRA) unit, a Small-kernel Local Attention (SLA) unit, and an Inverted Bottleneck (IBN) unit. WRA was employed to capture global attention, and its large convolution kernel was further decomposed to simplify the calculation. SLA was used to supplement WRA with local attention. IBN was constructed to fuse the extracted features, and it included a global frequency response layer that redistributes the frequency response of the fused feature maps.
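The abstract does not state how the large WRA kernel is decomposed. As a rough illustration only, the sketch below assumes one widely used scheme from large-kernel attention designs: a K x K convolution is replaced by a (2d-1) x (2d-1) depthwise convolution, a ceil(K/d) x ceil(K/d) depthwise convolution with dilation d, and a 1 x 1 pointwise convolution. The function names, the channel count, and the values K = 21 and d = 3 are hypothetical and are not taken from the paper; biases are ignored.

```python
import math


def dense_conv_params(channels: int, kernel: int) -> int:
    """Weights of a standard kernel x kernel 2D convolution (channels in and out, no bias)."""
    return channels * channels * kernel * kernel


def decomposed_conv_params(channels: int, kernel: int, dilation: int) -> int:
    """Weights of the assumed three-part decomposition (no bias)."""
    local = 2 * dilation - 1                 # side of the small depthwise kernel
    dilated = math.ceil(kernel / dilation)   # side of the dilated depthwise kernel
    depthwise_local = channels * local * local
    depthwise_dilated = channels * dilated * dilated
    pointwise = channels * channels          # 1 x 1 convolution that mixes channels
    return depthwise_local + depthwise_dilated + pointwise


def decomposed_receptive_field(kernel: int, dilation: int) -> int:
    """Side length of the receptive field covered by the two stacked depthwise convolutions."""
    local = 2 * dilation - 1
    dilated = math.ceil(kernel / dilation)
    return local + (dilated - 1) * dilation


if __name__ == "__main__":
    c, k, d = 64, 21, 3  # hypothetical values chosen for illustration
    print("dense weights:     ", dense_conv_params(c, k))
    print("decomposed weights:", decomposed_conv_params(c, k, d))
    print("receptive field:   ", decomposed_receptive_field(k, d))
```

Under these assumptions the decomposed form covers at least the same spatial extent as the dense kernel with far fewer weights, which is the kind of saving the phrase "decomposed to simplify the calculation" suggests.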
The proposed method was compared with relevant published esophageal segmentation methods. For the GTV, the proposed network achieved a mean surface distance (MSD) of 2.83 (1.62, 4.76) mm, a Hausdorff distance (HD) of 11.79 ± 6.02 mm, and a Dice coefficient (DC) of 72.45 ± 19.18%. For the CTV, it achieved MSD = 5.26 (2.18, 8.82) mm, HD = 16.22 ± 10.01 mm, and DC = 71.06 ± 17.72%.
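For readers unfamiliar with the reported metrics, the following is a minimal sketch of how a Dice coefficient (DC) and a Hausdorff distance (HD) can be computed for two binary 2D masks. It is not the authors' evaluation code: the helper names, the use of all foreground pixels rather than extracted surface points, and the `spacing` parameter (pixel size in mm) are assumptions made for illustration.

```python
import math
from typing import Sequence, Set, Tuple

Point = Tuple[int, int]


def mask_to_points(mask: Sequence[Sequence[int]]) -> Set[Point]:
    """Collect (row, col) coordinates of all non-zero pixels in a 2D mask."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def dice_coefficient(pred: Set[Point], truth: Set[Point]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|); two empty masks count as a perfect match."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def directed_hausdorff(a: Set[Point], b: Set[Point],
                       spacing: Tuple[float, float] = (1.0, 1.0)) -> float:
    """Largest distance from a point in `a` to its nearest point in `b`."""
    if not a or not b:
        raise ValueError("Hausdorff distance is undefined for an empty mask")
    return max(
        min(math.hypot((p[0] - q[0]) * spacing[0], (p[1] - q[1]) * spacing[1]) for q in b)
        for p in a
    )


def hausdorff_distance(a: Set[Point], b: Set[Point],
                       spacing: Tuple[float, float] = (1.0, 1.0)) -> float:
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b, spacing), directed_hausdorff(b, a, spacing))


if __name__ == "__main__":
    truth = mask_to_points([[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]])
    pred = mask_to_points([[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]])
    print("Dice:", dice_coefficient(pred, truth))
    print("HD:  ", hausdorff_distance(pred, truth))
```

This brute-force version compares every pair of points, so it is only practical for small masks; a clinical evaluation would normally work on 3D volumes with the scanner's voxel spacing and an optimized distance routine.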
Reconstructing the skip connections of U-Net improved esophageal segmentation performance. The results indicate that the proposed network performs better on esophageal GTV and CTV segmentation than the compared methods.