Wahid Kareem A, Kaffey Zaphanlene Y, Farris David P, Humbert-Vidan Laia, Moreno Amy C, Rasmussen Mathis, Ren Jintao, Naser Mohamed A, Netherton Tucker J, Korreman Stine, Balakrishnan Guha, Fuller Clifton D, Fuentes David, Dohopolski Michael J
Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA.
Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA.
medRxiv. 2024 May 13:2024.05.13.24307226. doi: 10.1101/2024.05.13.24307226.
BACKGROUND/PURPOSE: The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, clinician trust in AI models remains notably limited, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope the existing literature on UQ in RT, identify areas for improvement, and determine future directions.
METHODS: We followed the PRISMA-ScR scoping review reporting guidelines. We used the population (human cancer patients), concept (utilization of AI UQ), context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8980 articles for initial review. Manuscript screening and data extraction were performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics.
RESULTS: We identified 56 articles published from 2015 to 2024. Ten domains of RT applications were represented; most studies evaluated auto-contouring (50%), followed by image synthesis (13%) and multiple applications simultaneously (11%). Twelve disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32%). Imaging data were used in 91% of studies, while only 13% incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60%), with Monte Carlo dropout being the most commonly implemented UQ method (32%), followed by ensembling (16%). Fifty-five percent of studies did not share code or datasets.
CONCLUSION: Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, there is a clear need to study additional UQ methods, such as conformal prediction. Our results may incentivize the development of guidelines for reporting and implementation of UQ in RT.