Genentech, Inc, South San Francisco, CA.
Department of Hematology, Aalborg University Hospital, Aalborg, Denmark.
J Clin Oncol. 2024 Sep 1;42(25):2966-2977. doi: 10.1200/JCO.23.01978. Epub 2024 Jun 6.
Artificial intelligence can reduce the time physicians spend on radiological assessments. For 18F-fluorodeoxyglucose-avid lymphomas, achieving a complete metabolic response (CMR) by end of treatment is prognostic.
Here, we present a deep learning-based algorithm for fully automated treatment response assessment according to the Lugano 2014 classification. The proposed four-stage method was trained on data from a multicountry clinical trial (ClinicalTrials.gov identifier: NCT01287741) and tested on three independent multicenter, multicountry test sets covering different non-Hodgkin lymphoma subtypes and different lines of treatment (ClinicalTrials.gov identifiers NCT02257567 and NCT02500407, plus a 20% holdout from ClinicalTrials.gov identifier NCT01287741). It outputs the lesions detected at baseline and follow-up to enable focused radiologist review.
The method's response assessments agreed closely with the adjudicated radiologic responses (eg, agreement for overall response assessment of 93%, 87%, and 85% in ClinicalTrials.gov identifiers NCT01287741, NCT02500407, and NCT02257567, respectively), a level similar to inter-radiologist agreement. The by-model responses were also strongly prognostic of outcomes, with a trend toward higher accuracy for death risk than the adjudicated radiologic responses (hazard ratio for end-of-treatment by-model CMR of 0.123, 0.054, and 0.205 in ClinicalTrials.gov identifiers NCT01287741, NCT02500407, and NCT02257567, respectively, compared with 0.226, 0.292, and 0.272 for CMR by the adjudicated responses). Furthermore, a radiologist reviewed the algorithm's assessments. The median review time was 1.38 minutes per assessment, and the radiologist's level of agreement with the model's responses did not differ to a statistically significant degree from the radiologist's level of agreement with the adjudicated responses.
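As an illustration of how an agreement figure such as those above might be computed, the following is a minimal sketch, assuming agreement means the fraction of assessments where the model and the adjudicated read assign the same response category. The abstract does not state which agreement statistic was used, so this definition is an assumption. The function names, the category labels (`CMR`, `PMR`, `NMR`, `PMD`), and the toy data are hypothetical and are not taken from the study. Cohen's kappa is included only as a commonly reported chance-corrected companion measure, not as something the abstract reports.

```python
from collections import Counter


def percent_agreement(model, reference):
    """Fraction of paired assessments with identical response categories."""
    if len(model) != len(reference) or not model:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(m == r for m, r in zip(model, reference))
    return matches / len(model)


def cohen_kappa(model, reference):
    """Chance-corrected agreement between two raters (Cohen's kappa)."""
    n = len(model)
    p_observed = percent_agreement(model, reference)
    model_counts = Counter(model)
    reference_counts = Counter(reference)
    categories = set(model_counts) | set(reference_counts)
    # Expected agreement if both raters labeled independently
    # with their own marginal category frequencies.
    p_expected = sum(
        model_counts[c] * reference_counts[c] for c in categories
    ) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: ten assessments, not values from the study.
model_calls = ["CMR", "CMR", "PMR", "PMD", "CMR",
               "NMR", "PMD", "CMR", "PMR", "CMR"]
adjudicated = ["CMR", "CMR", "PMR", "PMD", "PMR",
               "NMR", "PMD", "CMR", "CMR", "CMR"]

agreement = percent_agreement(model_calls, adjudicated)
kappa = cohen_kappa(model_calls, adjudicated)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.3f}")
```

Raw agreement is easy to interpret but can look high when one category dominates, which is why a chance-corrected measure is often reported alongside it.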
These results suggest that the proposed method can be incorporated into radiologic response assessment workflows in cancer imaging for significant time savings and with performance similar to trained medical experts.