Zhu Jiahua, Cui Taoran, Zhang Yin, Zhang Yang, Ma Chi, Liu Bo, Nie Ke, Yue Ning J, Wang Xiao
Department of Radiation Oncology, Rutgers-Cancer Institute of New Jersey, Rutgers-Robert Wood Johnson Medical School, New Brunswick, NJ, United States.
Department of Radiation Oncology, Reading Hospital, Tower Health, West Reading, PA, United States.
Front Oncol. 2022 Jan 31;11:756503. doi: 10.3389/fonc.2021.756503. eCollection 2021.
The beam output of a double scattering proton system varies with each combination of beam option, range, and modulation, and is therefore difficult for the treatment planning system (TPS) to model accurately. This study aims to design an empirical method that uses analytical and machine learning (ML) models to estimate proton output in a double scattering proton system.
Three analytical models using polynomial, linear, and logarithm-polynomial equations were fitted separately for each option on a training dataset of 1,544 clinical measurements to estimate proton output. In parallel, three ML models using Gaussian process regression (GPR) with an exponential kernel, a squared exponential kernel, and a rational quadratic kernel were built for all options combined. The accuracy of each model was validated against 241 additional clinical measurements that served as the testing dataset. The two most robust models were selected, and the minimum number of samples each needed to achieve sufficient accuracy (±3%) was determined by evaluating the mean absolute percentage error (MAPE) as the number of samples increased. The outputs estimated by the two models were also compared for 1,000 proton beams with randomly generated range and modulation for each option.
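The per-option polynomial fit and the MAPE metric described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the polynomial's form, so the degree-2 basis in range and modulation, the function names, and the synthetic `toy_output` data are all hypothetical.

```python
import numpy as np


def mape(measured, predicted):
    """Mean absolute percentage error, in percent."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((predicted - measured) / measured)) * 100.0)


def design_matrix(r, m, degree=2):
    """Monomials r**i * m**j with i + j <= degree (assumed feature choice)."""
    r = np.asarray(r, dtype=float)
    m = np.asarray(m, dtype=float)
    cols = [r**i * m**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.column_stack(cols)


def fit_polynomial(r, m, output, degree=2):
    """Ordinary least-squares fit of the polynomial coefficients."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(r, m, degree), output, rcond=None)
    return coeffs


def predict_polynomial(coeffs, r, m, degree=2):
    return design_matrix(r, m, degree) @ coeffs


def toy_output(r, m):
    """Synthetic stand-in for measured output; NOT clinical data."""
    return 1.0 + 0.01 * r - 0.02 * m + 0.001 * r * m


rng = np.random.default_rng(0)

# One hypothetical beam option: 40 training beams with 0.5% measurement noise.
r_train = rng.uniform(10.0, 20.0, 40)   # range (arbitrary units)
m_train = rng.uniform(2.0, 10.0, 40)    # modulation (arbitrary units)
out_train = toy_output(r_train, m_train) * (1.0 + rng.normal(0.0, 0.005, 40))

coeffs = fit_polynomial(r_train, m_train, out_train)

# Held-out beams play the role of the testing dataset.
r_test = rng.uniform(10.0, 20.0, 20)
m_test = rng.uniform(2.0, 10.0, 20)
test_mape = mape(toy_output(r_test, m_test),
                 predict_polynomial(coeffs, r_test, m_test))
print(f"test MAPE: {test_mape:.2f}%")
```

Repeating the fit on growing subsets of the training beams and recording the test MAPE each time would give the sample-size curve the study describes.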
The polynomial model and the ML GPR model with exponential kernel yielded the most accurate estimations, deviating by less than 3% from the measured outputs. At least 20 samples per option were needed to build the polynomial model with a MAPE below 1%, whereas a total of at least 400 samples across all beam options were needed for the ML GPR model with exponential kernel to achieve comparable accuracy. On the testing dataset, the two independent models agreed with each other to within 2%.
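For readers unfamiliar with GPR, the following is a minimal sketch of regression with an exponential kernel, k(x, x') = σ² exp(−‖x − x'‖ / ℓ). It rests on several assumptions that do not come from the source: the inputs are two features already scaled to [0, 1], the hyperparameters (`length_scale`, `signal_var`, `noise_var`) are fixed by hand rather than optimized as a full GPR fit would do, and the data are synthetic.

```python
import numpy as np


def exponential_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """k(x, x') = signal_var * exp(-||x - x'|| / length_scale)."""
    a = np.atleast_2d(a)
    b = np.atleast_2d(b)
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return signal_var * np.exp(-dists / length_scale)


def gpr_fit(x_train, y_train, noise_var=1e-4):
    """Precompute the weights of the GP posterior mean (constant-mean model)."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    y_mean = y_train.mean()
    k = exponential_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    chol = np.linalg.cholesky(k)
    # Solve (K + noise_var * I) alpha = y - mean via the Cholesky factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    return x_train, alpha, y_mean


def gpr_predict(model, x_new):
    """Posterior mean at the new inputs."""
    x_train, alpha, y_mean = model
    return y_mean + exponential_kernel(x_new, x_train) @ alpha


rng = np.random.default_rng(1)

# Synthetic, pre-scaled (range, modulation) features; NOT clinical data.
x_train = rng.uniform(0.0, 1.0, (200, 2))
y_train = 1.0 + 0.1 * x_train[:, 0] - 0.05 * x_train[:, 1]

x_test = rng.uniform(0.0, 1.0, (50, 2))
y_test = 1.0 + 0.1 * x_test[:, 0] - 0.05 * x_test[:, 1]

model = gpr_fit(x_train, y_train)
pred = gpr_predict(model, x_test)
gpr_mape = float(np.mean(np.abs((pred - y_test) / y_test)) * 100.0)
print(f"GPR test MAPE: {gpr_mape:.3f}%")
```

Because a single GPR model covers all beam options at once, it has to learn the whole input space from data. That is consistent with the larger total sample count reported above, although this toy example does not reproduce that number.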
The polynomial model and the ML GPR model with exponential kernel were built for proton output estimation and deviated by less than 3% from the measurements. Each can serve as an independent output prediction tool for a double scattering proton beam, and together they provide a secondary output check by cross-checking one against the other.
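A cross-check between the two models could look like the sketch below. The function name, the choice of the mean of the two predictions as the reference, and the use of 2% as an action threshold are assumptions for illustration. The abstract reports that the models agreed to within 2% but does not define a clinical tolerance.

```python
def cross_check(output_a, output_b, tolerance_percent=2.0):
    """Compare two independent output estimates.

    Returns (deviation in percent relative to their mean, within tolerance?).
    """
    mean = 0.5 * (output_a + output_b)
    deviation = 100.0 * (output_a - output_b) / mean
    return deviation, abs(deviation) <= tolerance_percent


# Hypothetical predictions from the polynomial and GPR models for one beam.
deviation, ok = cross_check(1.000, 1.015)
print(f"deviation: {deviation:+.2f}%  within tolerance: {ok}")
```

A beam that fails the check would be a candidate for direct measurement rather than reliance on either estimate.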