Hochreuter Kim M, Ren Jintao, Nijkamp Jasper, Korreman Stine S, Lukacova Slávka, Kallehauge Jesper F, Trip Anouk K
Danish Centre for Particle Therapy, Aarhus University Hospital, Aarhus, Denmark.
Department of Clinical Medicine, Aarhus University, Aarhus, Denmark.
Phys Imaging Radiat Oncol. 2024 Aug 5;31:100620. doi: 10.1016/j.phro.2024.100620. eCollection 2024 Jul.
Deep-learning (DL) models for segmentation of the gross tumor volume (GTV) in radiotherapy are generally based on clinical delineations, which suffer from inter-observer variability. The aim of this study was to compare the performance of a DL model based on clinical glioblastoma GTVs with that of a model based on a single-observer edited version of the same GTVs.
The dataset included imaging data (computed tomography (CT), T1, contrast-T1 (T1C), and fluid-attenuated inversion recovery (FLAIR)) of 259 glioblastoma patients treated with post-operative radiotherapy between 2012 and 2019 at a single institution. The clinical GTVs were edited using all imaging data. The dataset was split into 207 cases for training/validation and 52 for testing. GTV segmentation models (nnUNet) were trained on the clinical and the edited GTVs separately and compared using the surface Dice similarity coefficient with a 1 mm tolerance (sDSC). We also evaluated model performance with respect to extent of resection (EOR) and different imaging combinations (T1C/T1/FLAIR/CT, T1C/FLAIR/CT, T1C/FLAIR, T1C/CT, T1C/T1, T1C). A Wilcoxon test was used for significance testing.
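As an illustration of the evaluation metric, the following is a minimal sketch of a surface Dice computation on two binary masks. It is not the implementation used in the study: it counts surface voxels rather than weighting by surface area, and the function names `surface_voxels` and `surface_dice`, as well as the `spacing_mm` and `tolerance_mm` parameters, are assumptions made for this example.

```python
import numpy as np
from scipy import ndimage


def surface_voxels(mask):
    """Return the boundary voxels of a binary mask (mask minus its erosion)."""
    mask = np.asarray(mask, dtype=bool)
    eroded = ndimage.binary_erosion(mask)
    return mask & ~eroded


def surface_dice(mask_a, mask_b, spacing_mm=(1.0, 1.0, 1.0), tolerance_mm=1.0):
    """Voxel-count approximation of the surface Dice at a given tolerance.

    Fraction of surface voxels of each mask that lie within `tolerance_mm`
    of the other mask's surface, pooled over both surfaces.
    """
    surf_a = surface_voxels(mask_a)
    surf_b = surface_voxels(mask_b)
    n_a, n_b = int(surf_a.sum()), int(surf_b.sum())
    if n_a + n_b == 0:
        return float("nan")  # both masks empty: metric undefined
    if n_a == 0 or n_b == 0:
        return 0.0  # one mask empty: no surface agreement possible

    # Distance (in mm) from every voxel to the nearest surface voxel.
    dist_to_a = ndimage.distance_transform_edt(~surf_a, sampling=spacing_mm)
    dist_to_b = ndimage.distance_transform_edt(~surf_b, sampling=spacing_mm)

    a_within = int((dist_to_b[surf_a] <= tolerance_mm).sum())
    b_within = int((dist_to_a[surf_b] <= tolerance_mm).sum())
    return (a_within + b_within) / (n_a + n_b)
```

Passing the voxel spacing of the image grid through `spacing_mm` keeps the tolerance in millimetres even for anisotropic volumes. A production evaluation would more likely rely on an established, surface-area-weighted implementation.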
The median (range) sDSC of the clinical-GTV model and the edited-GTV model, both evaluated against the edited contours, was 0.76 (0.43-0.94) and 0.92 (0.60-0.98), respectively (p < 0.001). sDSC did not differ significantly between patients with a biopsy, a partial resection, or a complete resection. T1C as the single input performed as well as the imaging combinations.
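A paired comparison of per-patient scores from two models can be sketched as follows. The abstract states only that a Wilcoxon test was used; the signed-rank variant is assumed here because both models are scored on the same test cases. The helper name `compare_paired_scores` is hypothetical, and the arrays are synthetic placeholders, not study data.

```python
import numpy as np
from scipy import stats


def compare_paired_scores(scores_a, scores_b):
    """Summarise two paired score vectors and run a Wilcoxon signed-rank test."""
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    if scores_a.shape != scores_b.shape:
        raise ValueError("paired test needs one score per case for each model")
    result = stats.wilcoxon(scores_a, scores_b)  # two-sided by default
    return {
        "median_a": float(np.median(scores_a)),
        "range_a": (float(scores_a.min()), float(scores_a.max())),
        "median_b": float(np.median(scores_b)),
        "range_b": (float(scores_b.min()), float(scores_b.max())),
        "p_value": float(result.pvalue),
    }


# Synthetic placeholder scores for 52 hypothetical test cases (NOT study data).
rng = np.random.default_rng(0)
placeholder_a = rng.uniform(0.4, 0.9, size=52)
placeholder_b = np.clip(placeholder_a + rng.uniform(0.05, 0.2, size=52), 0.0, 1.0)

summary = compare_paired_scores(placeholder_a, placeholder_b)
print(summary)
```

For comparisons between independent patient groups, such as the EOR categories, an unpaired test would be needed instead of the signed-rank test.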
The DL models achieved high segmentation accuracy. Editing of the clinical GTVs significantly increased DL performance with a relevant effect size. DL performance was robust to EOR and remained highly accurate using only T1C.