Fernandes Miguel Garrett, Bussink Johan, Wijsman Robin, Stam Barbara, Monshouwer René
Department of Radiation Oncology, Radboud University Medical Center, Nijmegen, The Netherlands.
Department of Radiation Oncology, University Medical Center Groningen, Groningen, The Netherlands.
Phys Imaging Radiat Oncol. 2024 Jan 4;29:100533. doi: 10.1016/j.phro.2024.100533. eCollection 2024 Jan.
Normal tissue complication probability (NTCP) models are developed from large retrospective datasets in which automatic contouring is often used to delineate the organs at risk. This study proposes a methodology to estimate how discrepancies between two sets of contours are reflected in NTCP model performance. We apply this methodology to heart contours within a dataset of non-small cell lung cancer (NSCLC) patients.
One of the contour sets is designated the ground truth, and a dosimetric parameter derived from it is used to simulate outcomes via a predefined NTCP relationship. For each simulated outcome, the selected dosimetric parameter from each contour set is used individually to fit a toxicity model, and the performance of the resulting models is compared. Our dataset comprised 605 stage IIA-IIIB NSCLC patients. Manual, deep learning, and atlas-based heart contours were available.
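The simulation loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the logistic NTCP form, the `d50` and `gamma50` parameters, the use of AUC as the performance measure, and the synthetic mean heart dose values are all assumptions introduced here. Because a single-predictor logistic fit with a positive coefficient is a monotone transform of the predictor, its AUC equals the AUC of the raw parameter, so the sketch skips the explicit fit.

```python
import numpy as np

def ntcp_logistic(dose, d50, gamma50):
    """Assumed logistic NTCP curve; gamma50 is the normalized slope at d50."""
    return 1.0 / (1.0 + np.exp(4.0 * gamma50 * (1.0 - dose / d50)))

def auc(score, outcome):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = score[outcome == 1]
    neg = score[outcome == 0]
    if pos.size == 0 or neg.size == 0:
        return float("nan")  # AUC is undefined without both classes
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def simulate_auc_gap(dose_truth, dose_alt, d50, gamma50, n_sim=200, seed=0):
    """Mean AUC loss when dose_alt replaces the ground-truth parameter."""
    rng = np.random.default_rng(seed)
    p = ntcp_logistic(dose_truth, d50, gamma50)  # outcome probabilities from ground truth
    gaps = []
    for _ in range(n_sim):
        y = (rng.random(p.size) < p).astype(int)  # one simulated outcome per patient
        gaps.append(auc(dose_truth, y) - auc(dose_alt, y))
    return float(np.nanmean(gaps))

# Synthetic stand-in data (hypothetical values, not the study cohort).
rng = np.random.default_rng(1)
mhd_manual = rng.gamma(shape=2.0, scale=5.0, size=605)  # "ground truth" mean heart dose (Gy)
mhd_auto = np.maximum(mhd_manual + rng.normal(0.0, 1.5, size=605), 0.0)  # perturbed by contour error

gap_shallow = simulate_auc_gap(mhd_manual, mhd_auto, d50=20.0, gamma50=0.5)
gap_steep = simulate_auc_gap(mhd_manual, mhd_auto, d50=20.0, gamma50=2.0)
print(f"AUC gap, shallow curve: {gap_shallow:.4f}; steep curve: {gap_steep:.4f}")
```

Varying `gamma50`, the dosimetric parameter, and the number of patients in such a sketch would probe the same three factors the study examines.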
How contour differences were reflected in NTCP model performance depended on the slope of the predefined model, the dosimetric parameter used, and the size of the cohort. The impact of contour differences on NTCP model performance increased with steeper NTCP curves. In our dataset, parameters in the low range of the dose-volume histogram were more robust to contour differences.
Our methodology can be used to estimate whether a given contouring model is fit for NTCP model development. For the heart in comparable datasets, the average Dice similarity coefficient should be at least as high as that between our manual and deep learning contours (88.5 ± 4.5 %) for shallow NTCP relationships, and higher for steep relationships.
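For reference, the Dice similarity coefficient between two contours can be computed from their voxel masks as twice the overlap divided by the sum of the two volumes. The sketch below assumes both contours have already been rasterized to boolean arrays on the same grid; the function name `dice` and the toy cubes are illustrative only.

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice similarity coefficient between two boolean masks on the same grid."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / denom

# Toy example: two 6x6x6 cubes offset by one voxel along the first axis.
manual = np.zeros((10, 10, 10), dtype=bool)
auto = np.zeros((10, 10, 10), dtype=bool)
manual[2:8, 2:8, 2:8] = True
auto[3:9, 2:8, 2:8] = True

score = dice(manual, auto)
print(f"Dice = {100 * score:.1f} %")
```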