Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, USA.
Med Phys. 2022 Aug;49(8):5244-5257. doi: 10.1002/mp.15765. Epub 2022 Jun 8.
Fast and accurate multiorgan segmentation from computed tomography (CT) scans is essential for radiation treatment planning. Self-attention (SA)-based deep learning methods provide higher accuracy than standard methods but require memory-intensive and computationally intensive calculations, which restricts their use to relatively shallow networks.
Our goal was to develop and test a new computationally fast and memory-efficient bidirectional SA method called nested block self-attention (NBSA), which is applicable to shallow and deep multiorgan segmentation networks.
A new multiorgan segmentation method combining a deep multiple resolution residual network with a computationally efficient SA called nested block SA (MRRN-NBSA) was developed and evaluated for segmenting 18 different organs from the head and neck (HN) and abdomen. MRRN-NBSA combines features from multiple image resolutions and feature levels with SA to extract organ-specific contextual features. Computational efficiency is achieved by using memory blocks of fixed spatial extent for the SA calculation, combined with bidirectional attention flow. Separate models were trained for HN (n = 238) and abdomen (n = 30). The HN model was tested on a set-aside open-source grand challenge data set (n = 10) from a public domain database of computational anatomy. The abdomen model underwent blinded testing on 20 cases from the Beyond the Cranial Vault data set, with overall accuracy provided by the grand challenge website. Robustness to two-rater segmentations was also evaluated for HN cases using the open-source data set. MRRN-NBSA was compared statistically against Unet, convolutional network-based SA using criss-cross attention (CCA), dual SA, and transformer-based (UNETR) methods. Differences in the average Dice similarity coefficient (DSC) accuracy for all HN organs were first assessed using the Kruskal-Wallis test, followed by individual method comparisons using paired, two-sided Wilcoxon signed-rank tests at the 95% confidence level, with Bonferroni correction for multiple comparisons.
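To illustrate why restricting SA to blocks of fixed spatial extent lowers cost, the following is a minimal NumPy sketch of plain block-wise self-attention on a 2D feature map. It is not the authors' NBSA: it omits the nesting and the bidirectional attention flow, and it uses identity query/key/value projections in place of learned ones. The function names `block_self_attention` and `attention_matrix_entries` and the block size are hypothetical choices made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def block_self_attention(feat, block):
    """Self-attention computed independently inside non-overlapping blocks.

    feat  : array of shape (H, W, C), a 2D feature map with C channels.
    block : side length of each square block; H and W must be divisible by it.
    Assumption: identity projections stand in for learned Q/K/V weights.
    """
    H, W, C = feat.shape
    assert H % block == 0 and W % block == 0, "H and W must be divisible by block"

    # Split into blocks: (H/b, b, W/b, b, C) -> (num_blocks, b*b, C).
    x = feat.reshape(H // block, block, W // block, block, C)
    x = x.transpose(0, 2, 1, 3, 4).reshape(-1, block * block, C)

    # Scaled dot-product attention restricted to positions in the same block.
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(C)  # (num_blocks, b*b, b*b)
    out = softmax(scores, axis=-1) @ x              # (num_blocks, b*b, C)

    # Undo the block partition to recover the (H, W, C) layout.
    out = out.reshape(H // block, W // block, block, block, C)
    return out.transpose(0, 2, 1, 3, 4).reshape(H, W, C)


def attention_matrix_entries(H, W, block=None):
    # Number of pairwise attention scores: full SA versus block-restricted SA.
    n = H * W
    return n * n if block is None else n * block * block
```

For a 64 × 64 feature map, full SA needs 4096² pairwise scores, whereas blocks of side 8 need 4096 × 64, a 64-fold reduction. The price is that each position only sees its own block, which is the limitation that passing attention between blocks is meant to address.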
MRRN-NBSA produced a high average DSC of 0.88 for HN and 0.86 for the abdomen, exceeding that of current methods. MRRN-NBSA was more accurate than CCA, the most computationally efficient method (average DSC of 0.845 for HN, 0.727 for abdomen). The Kruskal-Wallis test showed a significant difference between the evaluated methods (p = 0.00025). Pair-wise comparisons showed significant differences between MRRN-NBSA and Unet (p = 0.0003), CCA (p = 0.030), dual SA (p = 0.038), and UNETR (p = 0.012) after Bonferroni correction. MRRN-NBSA produced less variable segmentations for submandibular glands (0.82 ± 0.06) than the two raters (0.75 ± 0.31).
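The evaluation pipeline reported here (DSC per case, an omnibus Kruskal-Wallis test, then paired two-sided Wilcoxon signed-rank tests with Bonferroni correction) could be sketched as follows using SciPy. This is an illustrative sketch, not the authors' analysis code: the helper names `dice` and `compare_methods` are hypothetical, and the sketch assumes each method's DSC values are stored in the same case order so that they can be paired.

```python
import numpy as np
from scipy.stats import kruskal, wilcoxon


def dice(pred, ref):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return float(2.0 * np.logical_and(pred, ref).sum() / denom)


def compare_methods(scores, reference, alpha=0.05):
    """Omnibus test followed by Bonferroni-corrected pairwise tests.

    scores    : dict mapping method name -> 1D array of DSC values,
                with all arrays in the same case order (paired samples).
    reference : name of the method every other method is compared against.
    alpha     : significance level (0.05 for a 95% confidence level).
    """
    names = list(scores)
    # Kruskal-Wallis: is there any difference among the methods?
    _, p_omnibus = kruskal(*(scores[n] for n in names))

    others = [n for n in names if n != reference]
    results = {}
    for n in others:
        # Paired, two-sided Wilcoxon signed-rank test against the reference.
        _, p = wilcoxon(scores[reference], scores[n], alternative="two-sided")
        # Bonferroni: scale by the number of comparisons, capped at 1.
        p_adj = min(1.0, p * len(others))
        results[n] = (p_adj, p_adj < alpha)
    return p_omnibus, results
```

Calling `compare_methods(scores, reference="MRRN-NBSA")` with a dictionary of DSC arrays would return the omnibus p-value and, for each competing method, its adjusted p-value together with a flag indicating significance at the chosen level.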
MRRN-NBSA produced more accurate multiorgan segmentations than current methods on two different public data sets. Testing on larger institutional cohorts is required to establish feasibility for clinical use.