Beasley William J, McWilliam Alan, Aitkenhead Adam, Mackay Ranald I, Rowbottom Carl G
University of Manchester; The Christie NHS Foundation Trust.
J Appl Clin Med Phys. 2016 Mar 8;17(2):41-49. doi: 10.1120/jacmp.v17i2.5889.
Contouring structures in the head and neck is time-consuming, and automatic segmentation is an important part of an adaptive radiotherapy workflow. Geometric accuracy of automatic segmentation algorithms has been widely reported, but there is no consensus as to which metrics provide clinically meaningful results. This study investigated whether geometric accuracy (as quantified by several commonly used metrics) was associated with dosimetric differences for the parotid and larynx, comparing automatically generated contours against manually drawn ground truth contours. This enabled the suitability of different commonly used metrics to be assessed for measuring automatic segmentation accuracy of the parotid and larynx. Parotid and larynx structures for 10 head and neck patients were outlined by five clinicians to create ground truth structures. An automatic segmentation algorithm was used to create automatically generated normal structures, which were then used to create volumetric-modulated arc therapy plans. The mean doses to the automatically generated structures were compared with those of the corresponding ground truth structures, and the relative difference in mean dose was calculated for each structure. It was found that this difference did not correlate with the geometric accuracy provided by several metrics, notably the Dice similarity coefficient, which is a commonly used measure of spatial overlap. Surface-based metrics provided stronger correlation and are, therefore, more suitable for assessing automatic segmentation of the parotid and larynx.
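To make the quantities in the abstract concrete, the following is a minimal sketch of how a Dice similarity coefficient, two surface-based distances, and a relative difference in mean dose could be computed for a pair of structures. It is illustrative only and is not the study's implementation: the abstract does not say which surface-based metrics were used, so the mean surface distance and Hausdorff distance here are assumed examples, and the voxel-set representation, the function names, the toy dose gradient, and the sign convention (automatic minus ground truth, divided by ground truth) are all assumptions.

```python
from math import dist  # Euclidean distance, Python 3.8+

# 6-connected face neighbours used to decide whether a voxel is on the surface.
OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dice(a, b):
    """Dice similarity coefficient between two sets of voxel indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def surface(voxels):
    """Voxels that have at least one face neighbour outside the structure."""
    voxels = set(voxels)
    return {
        (x, y, z)
        for (x, y, z) in voxels
        if any((x + dx, y + dy, z + dz) not in voxels for dx, dy, dz in OFFSETS)
    }


def _directed_distances(src, dst):
    """For each surface voxel in src, the distance to the nearest one in dst."""
    return [min(dist(p, q) for q in dst) for p in src]


def mean_surface_distance(a, b):
    """Symmetric mean surface-to-surface distance, in voxel units."""
    sa, sb = surface(a), surface(b)
    d = _directed_distances(sa, sb) + _directed_distances(sb, sa)
    return sum(d) / len(d)


def hausdorff_distance(a, b):
    """Symmetric maximum surface-to-surface distance, in voxel units."""
    sa, sb = surface(a), surface(b)
    return max(_directed_distances(sa, sb) + _directed_distances(sb, sa))


def relative_mean_dose_difference(dose, auto, truth):
    """(mean dose in auto - mean dose in truth) / mean dose in truth.

    `dose` maps a voxel index to a dose value; the sign convention is assumed.
    """
    mean_auto = sum(dose[v] for v in auto) / len(auto)
    mean_truth = sum(dose[v] for v in truth) / len(truth)
    return (mean_auto - mean_truth) / mean_truth


# Toy data (hypothetical): a 4x4x4 "ground truth" cube and an "automatic"
# contour that is the same cube shifted by one voxel along x.
truth = {(x, y, z) for x in range(4) for y in range(4) for z in range(4)}
auto = {(x + 1, y, z) for (x, y, z) in truth}

# Toy dose grid (hypothetical): dose rises linearly along x, in Gy.
dose = {
    (x, y, z): 10.0 + 2.0 * x
    for x in range(5) for y in range(4) for z in range(4)
}

dsc = dice(auto, truth)
msd = mean_surface_distance(auto, truth)
hd = hausdorff_distance(auto, truth)
rel = relative_mean_dose_difference(dose, auto, truth)
```

In this toy case a one-voxel shift leaves a fairly high overlap while the steep dose gradient still produces a noticeable change in mean dose, which is the kind of mismatch between geometric and dosimetric agreement that the study set out to examine. The brute-force nearest-neighbour search is quadratic in the number of surface voxels and is only suitable for small examples.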