Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, United States.
Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, United States.
Radiother Oncol. 2021 Jul;160:185-191. doi: 10.1016/j.radonc.2021.05.003. Epub 2021 May 11.
Advances in artificial intelligence-based methods have led to the development and publication of numerous systems for auto-segmentation in radiotherapy. These systems have the potential to decrease contour variability, which has been associated with poor clinical outcomes, and to increase efficiency in the treatment planning workflow. However, there are no uniform standards for evaluating auto-segmentation platforms or for assessing how well they meet these goals. Here, we review the most frequently used evaluation techniques, which include geometric overlap, dosimetric parameters, time spent contouring, and clinical rating scales. These data suggest that many of the most commonly used geometric indices, such as the Dice Similarity Coefficient, are not well correlated with clinically meaningful endpoints. As such, a multi-domain evaluation, including composite geometric and/or dosimetric metrics with physician-reported assessment, is necessary to gauge the clinical readiness of auto-segmentation for radiation treatment planning.
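For readers unfamiliar with the geometric overlap index named above, the following is a minimal sketch of how a Dice Similarity Coefficient can be computed for two binary masks, using the standard definition DSC = 2|A ∩ B| / (|A| + |B|). The function name `dice_coefficient`, the flattened-list mask representation, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not part of the reviewed work.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice Similarity Coefficient for two binary masks.

    Masks are assumed (for illustration) to be flat sequences of 0/1
    values of equal length, e.g. a flattened voxel grid.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")

    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]

    # Voxels labelled as structure in both contours.
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)

    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical example: a manual and an automatic contour on six voxels.
manual_contour = [0, 1, 1, 1, 0, 0]
auto_contour = [0, 0, 1, 1, 1, 0]
score = dice_coefficient(manual_contour, auto_contour)
```

The score depends only on voxel counts, not on where the two contours disagree, which is one reason a single overlap index may say little about dosimetric or clinical impact.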