Department of Radiation Oncology, Mayo Clinic Arizona, Phoenix, Arizona, USA.
Department of Radiation Oncology, University of Kentucky, Lexington, Kentucky, USA.
Med Phys. 2023 Jul;50(7):4079-4091. doi: 10.1002/mp.16537. Epub 2023 Jun 7.
Deep learning auto-segmentation (DLAS) models have been adopted in the clinic; however, their performance deteriorates because of variability in clinical practice. Some commercial DLAS software packages provide an incremental retraining function that lets users train a custom model on their institutional data to account for this variability.
This study was performed to evaluate and implement a commercial DLAS software package with the incremental retraining function for definitive treatment of patients with prostate cancer in a multi-user environment.
CT-based delineations of target organs and organs at risk (OARs) from 215 patients with prostate cancer were used. The built-in models of three commercial DLAS software packages were validated on 20 patients. A retrained custom model was developed using 100 patients and evaluated on the remaining data (n = 115). Dice similarity coefficient (DSC), Hausdorff distance (HD), mean surface distance (MSD), and surface DSC (SDSC) were used for quantitative evaluation. A blinded multi-rater qualitative evaluation was performed with a five-level scale. Consensus and non-consensus unacceptable cases were visually inspected to identify the failure modes.
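As an illustration of two of the quantitative metrics named above, the following is a minimal sketch of DSC and HD on toy data. It is not the study's evaluation code: the function names `dice` and `hausdorff`, the representation of each contour as a set of voxel indices, and the toy masks are all assumptions made for this example. Distances here are in voxel-index units and are computed over all voxels of each mask rather than over extracted surfaces, which is a simplification.

```python
from math import dist


def dice(a, b):
    """Dice similarity coefficient between two sets of voxel indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(p, q):
        # Largest distance from any point in p to its nearest point in q.
        return max(min(dist(x, y) for y in q) for x in p)

    return max(directed(a, b), directed(b, a))


# Hypothetical toy masks: a 4x4 "manual" square and an "auto" square
# shifted down by one row.
manual = {(r, c) for r in range(0, 4) for c in range(0, 4)}
auto = {(r, c) for r in range(1, 5) for c in range(0, 4)}

print(dice(manual, auto))       # 0.75 (12 shared voxels out of 16 + 16)
print(hausdorff(manual, auto))  # 1.0 (the one-row shift)
```

In practice the masks would be 3D, distances would be scaled by the CT voxel spacing to obtain millimetres, and MSD and SDSC would be computed on the contour surfaces.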
The built-in models from three commercial DLAS vendors achieved sub-optimal performance in 20 patients. The retrained custom model had a mean DSC of 0.82 for prostate, 0.48 for seminal vesicles (SV), and 0.92 for rectum, a significant improvement over the built-in model, which had DSCs of 0.73, 0.37, and 0.81 for the corresponding structures. Manual contours achieved an acceptance rate of 96.5% and a consensus unacceptable rate (i.e., both reviewers rated the contour as unacceptable) of 3.5%; the custom model achieved a 91.3% acceptance rate and an 8.7% consensus unacceptable rate. The failure modes of the retrained custom model were attributed to the following: cystogram (n = 2), hip prosthesis (n = 2), low dose rate brachytherapy seeds (n = 2), air in endorectal balloon (n = 1), non-iodinated spacer (n = 2), and giant bladder (n = 1).
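The reported rates for the custom model can be cross-checked against the listed failure modes. The sketch below assumes, without confirmation from the abstract, that each listed failure corresponds to exactly one of the 115 evaluation patients and that the listed cases are the consensus unacceptable ones; the variable names are hypothetical.

```python
# Evaluation set size reported in the abstract.
n_eval = 115

# Failure-mode counts as listed in the abstract.
failure_modes = {
    "cystogram": 2,
    "hip prosthesis": 2,
    "low dose rate brachytherapy seeds": 2,
    "air in endorectal balloon": 1,
    "non-iodinated spacer": 2,
    "giant bladder": 1,
}

# Assumption: one patient per listed failure, no patient counted twice.
n_failures = sum(failure_modes.values())

unacceptable_rate = round(100 * n_failures / n_eval, 1)
acceptance_rate = round(100 * (n_eval - n_failures) / n_eval, 1)

print(n_failures)         # 10
print(unacceptable_rate)  # 8.7
print(acceptance_rate)    # 91.3
```

Under these assumptions, the 10 listed failures out of 115 patients reproduce the reported 8.7% consensus unacceptable rate and 91.3% acceptance rate.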
The commercial DLAS software with the incremental retraining function was validated and clinically adopted for prostate patients in a multi-user environment. AI-based auto-delineation of the prostate and OARs is shown to achieve improved physician acceptance, overall clinical utility, and accuracy.