Neural network dose prediction for cervical brachytherapy: Overcoming data scarcity for applicator-specific models.

Affiliations

Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, California, USA.

Herbert Wertheim School of Public Health and Human Longevity Science, University of California, San Diego and Moores Cancer Center, La Jolla, California, USA.

Publication information

Med Phys. 2024 Jul;51(7):4591-4606. doi: 10.1002/mp.17230. Epub 2024 May 30.

DOI:10.1002/mp.17230

PMID:38814165

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11309769/

Abstract

BACKGROUND

3D neural network dose predictions are useful for automating brachytherapy (BT) treatment planning for cervical cancer. Cervical BT can be delivered with numerous applicators, which necessitates developing models that generalize to multiple applicator types. The variability and scarcity of data for any given applicator type pose challenges for deep learning.

PURPOSE

The goal of this work was to compare three methods of neural network training to determine the optimal method for dose prediction: a single (combined) model trained on all applicator data, fine-tuning the combined model to each applicator, and individual (IDV) applicator models.

METHODS

Models were produced for four applicator types: tandem-and-ovoid (T&O), T&O with 1-7 needles (T&ON), tandem-and-ring (T&R), and T&R with 1-4 needles (T&RN). First, the combined model was trained on 859 treatment plans from 266 cervical cancer patients treated from 2010 onwards. The train/validation/test split was 70%/16%/14%, with approximately 49%/10%/19%/22% T&O/T&ON/T&R/T&RN in each dataset. Inputs included four channels for anatomical masks (high-risk clinical target volume [HRCTV], bladder, rectum, and sigmoid), a mask indicating dwell position locations, and applicator channels for each applicator component. Applicator channels were created by mapping the 3D dose for a single dwell position to each dwell position and summing over each applicator component with uniform dwell time weighting. A 3D Cascade U-Net, which consists of two U-Nets in sequence, and a mean squared error loss function were used. The combined model was then fine-tuned to produce four applicator-specific models by freezing the first U-Net and the encoding layers of the second U-Net and resuming training on applicator-specific data. Finally, four IDV models were trained using only data from each applicator type. Performance of these three model types was compared using the following metrics for the test set: mean error (ME, representing model bias) and mean absolute error (MAE) over all dose voxels, and ME of clinical metrics (HRCTV D90% and D2cc of the bladder, rectum, and sigmoid), averaged over all patients. A positive ME indicates the clinical dose was higher than predicted. 3D global gamma analysis with the prescription dose as reference value was performed. Dice similarity coefficients (DSC) were computed for each isodose volume.
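The voxel-wise ME and MAE described above can be illustrated with a minimal sketch. It assumes the dose grids are flattened into equal-length lists and that errors are expressed as a percentage of the prescription dose; that normalization, the function name `voxel_errors`, and the sample values are illustrative assumptions, not details taken from the paper.

```python
def voxel_errors(clinical, predicted, prescription):
    """Return (ME, MAE) in percent of the prescription dose.

    Sign convention follows the abstract: a positive ME means the
    clinical dose was higher than the predicted dose.
    Normalizing by the prescription dose is an assumption of this sketch.
    """
    if len(clinical) != len(predicted) or not clinical:
        raise ValueError("dose lists must be non-empty and of equal length")
    if prescription <= 0:
        raise ValueError("prescription dose must be positive")
    # Per-voxel signed difference, clinical minus predicted, in percent.
    diffs = [(c - p) / prescription * 100.0 for c, p in zip(clinical, predicted)]
    me = sum(diffs) / len(diffs)                    # signed mean: model bias
    mae = sum(abs(d) for d in diffs) / len(diffs)   # mean magnitude of error
    return me, mae


# Hypothetical doses in Gy for four voxels, prescription of 8 Gy.
clinical = [8.0, 4.0, 12.0, 2.0]
predicted = [7.0, 4.0, 12.5, 2.0]
me, mae = voxel_errors(clinical, predicted, prescription=8.0)
print(f"ME = {me:.4f}%, MAE = {mae:.4f}%")
```

Because ME keeps the sign, over- and under-predictions can cancel, so a small ME alongside a larger MAE indicates low bias rather than low error.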

RESULTS

The fine-tuned and combined models outperformed the IDV applicator models. Compared to the combined model, fine-tuning resulted in modest improvements in about half the metrics, while the remainder were mostly unchanged. For T&O/T&R/T&ON/T&RN, fine-tuned MAE = 3.98%/2.69%/5.36%/3.80%, and ME over all voxels = -0.08%/-0.89%/-0.59%/1.42%. ME D2cc values were bladder = -0.77%/1.00%/-0.66%/-1.53%, rectum = 1.11%/-0.22%/-0.29%/-3.37%, and sigmoid = -0.47%/-0.06%/-2.37%/-1.40%, and ME D90% = 2.6%/-4.4%/4.8%/0.0%. Gamma pass rates (3%/3 mm) were 86%/91%/83%/89%. Mean DSCs were 0.92/0.92/0.88/0.91 for isodoses ≤ 150% of prescription.
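As a rough illustration of how a DSC for an isodose volume can be computed, the sketch below thresholds the clinical and predicted doses at a given level and compares the resulting voxel sets. The function name `isodose_dsc`, the flattened-list representation, the sample doses, and the convention of returning 1.0 when both volumes are empty are assumptions made for this example, not details from the paper.

```python
def isodose_dsc(clinical, predicted, level):
    """Dice similarity coefficient of the volumes receiving >= `level` dose.

    DSC = 2 * |A and B| / (|A| + |B|), a dimensionless value in [0, 1].
    """
    if len(clinical) != len(predicted):
        raise ValueError("dose lists must be of equal length")
    in_clinical = [d >= level for d in clinical]
    in_predicted = [d >= level for d in predicted]
    overlap = sum(1 for a, b in zip(in_clinical, in_predicted) if a and b)
    total = sum(in_clinical) + sum(in_predicted)
    if total == 0:
        return 1.0  # both volumes empty: treated as perfect agreement (assumption)
    return 2.0 * overlap / total


# Hypothetical doses in Gy for five voxels, prescription of 8 Gy.
prescription = 8.0
clinical = [2.0, 5.0, 8.0, 11.0, 12.0]
predicted = [2.5, 3.5, 8.5, 10.0, 11.5]
for percent in (50, 100, 150):
    level = prescription * percent / 100.0
    print(percent, round(isodose_dsc(clinical, predicted, level), 3))
```

Averaging such values over the isodose levels up to 150% of prescription and over test patients would give a summary figure comparable in kind to the mean DSCs reported here.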

CONCLUSIONS

3D BT dose was accurately predicted for all applicator types, as indicated by the low MAE and MEs, high gamma pass rates, and high DSCs. Training on all treatment data overcomes challenges with data scarcity in each applicator type, resulting in better performance than can be achieved by training on IDV applicators alone. This could presumably be explained by the fact that the larger, more diverse dataset allows the neural network to learn underlying trends and characteristics in dose that are common to all treatment applicators. Accurate, applicator-specific dose predictions could enable automated, knowledge-based planning for any cervical brachytherapy treatment.
