Department of Radiation Oncology, Moffitt Cancer Center, Tampa, FL, USA; Department of Physics, University of South Florida, FL, USA.
Department of Radiation Oncology, Moffitt Cancer Center, Tampa, FL, USA.
Radiother Oncol. 2022 Sep;174:52-58. doi: 10.1016/j.radonc.2022.06.024. Epub 2022 Jul 9.
To introduce and validate a newly developed deep-learning (DL) auto-segmentation algorithm for head and neck (HN) organs at risk (OARs) and to compare its performance with a published commercial algorithm.
A total of 864 HN cancer cases were available to train and evaluate a prototype algorithm. The algorithm is based on a fully convolutional network that combines the U-Net and V-Net architectures. Training used a combined Dice and cross-entropy loss function with the Adam optimizer. For 75 validation cases, OAR sets were generated with three DL-based models (A: the prototype model trained with gold data; B: a commercial software package trained with the same data; and C: the same software trained with data from another institution). The auto-segmented structures were evaluated with the Dice similarity coefficient (DSC), Hausdorff distance (HD), voxel-penalty metric (VPM), and DSC of the area under dose-volume histograms. A subjective qualitative evaluation was performed on 20 randomly selected cases.
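The DSC, HD, and Dice plus cross-entropy loss named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the equal weighting of the two loss terms, the binary (single-structure) formulation, and the choice to compute HD over all foreground voxels rather than over surface points or a percentile are all assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return float(2.0 * np.logical_and(pred, ref).sum() / total)


def hausdorff_distance(pred, ref, spacing=1.0):
    """Symmetric Hausdorff distance between foreground voxels (brute force).

    `spacing` is a hypothetical parameter: a scalar or per-axis voxel size,
    so the result is in physical units such as mm.
    """
    a = np.argwhere(np.asarray(pred, dtype=bool)) * spacing
    b = np.argwhere(np.asarray(ref, dtype=bool)) * spacing
    if len(a) == 0 or len(b) == 0:
        raise ValueError("both masks must contain at least one foreground voxel")
    # Pairwise Euclidean distances; memory grows as len(a) * len(b).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Largest nearest-neighbour distance, taken in both directions.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


def dice_ce_loss(prob, target, eps=1e-7):
    """Soft Dice loss plus binary cross-entropy, equally weighted (assumption)."""
    prob = np.clip(np.asarray(prob, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    soft_dice = (2.0 * (prob * target).sum() + eps) / (prob.sum() + target.sum() + eps)
    ce = -np.mean(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    return float((1.0 - soft_dice) + ce)
```

Calling `dice_coefficient(auto_mask, gold_mask)` and `hausdorff_distance(auto_mask, gold_mask, spacing=voxel_size_mm)` on each OAR yields per-structure scores that can then be averaged across cases. The brute-force distance matrix is only practical for small structures; a production evaluation would typically use surface voxels and a spatial index.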
Overall, the prototype algorithm tended to be the closest to the gold data by all five metrics. The average DSC/VPM/HD values for algorithms A, B, and C were 0.81/84.1/1.6 mm, 0.74/62.8/3.2 mm, and 0.66/46.8/3.3 mm, respectively. Of the model A structures, 93% were judged to be clinically useful.
The superior performance of the prototype was validated, even in comparison with commercial software trained with the same data. Beyond the challenge of perfecting the algorithms themselves, auto-segmentation results can differ when the same algorithm is trained at different institutions.