Chen David, Arnold Kristen, Sukhdeo Ronesh, Farag Alla John, Raman Srinivas
Radiation Medicine Program, Princess Margaret Hospital Cancer Centre, Toronto, Ontario, Canada.
Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada.
BMJ Oncol. 2025 Aug 24;4(1):e000733. doi: 10.1136/bmjonc-2025-000733. eCollection 2025.
The advent of artificial intelligence (AI) tools in oncology to support clinical decision-making, reduce physician workload and automate inefficient workflows brings both great promise and a need for caution. Randomised controlled trials (RCTs) remain the gold standard for generating high-quality evidence on the safety and efficacy of AI interventions. However, the completeness and quality of reporting among AI trials in oncology remain unknown.
This systematic review investigates the concordance of reporting in RCTs of AI interventions in oncology with the CONSORT (Consolidated Standards of Reporting Trials) 2010 and CONSORT-AI 2020 extension guidelines and comprehensively summarises the state of AI RCTs in oncology.
We queried OVID MEDLINE and Embase on 22 October 2024 using AI, cancer and RCT search terms. Studies were included if they reported on an AI intervention in an RCT including participants with cancer.
This study included 57 RCTs of AI interventions in oncology that were primarily focused on screening (54%) or diagnosis (19%) and intended for clinician use (88%). Among all 57 RCTs, median concordance with CONSORT 2010 and CONSORT-AI 2020 was 82%. Compared with trials published before the release of CONSORT-AI (n=8), trials published after the release of CONSORT-AI (n=49) had lower median overall CONSORT (82% vs 92%) and CONSORT 2010 (81% vs 92%) concordance but similar median CONSORT-AI concordance (93% vs 93%). Guideline items on the study methodology needed to reproduce the AI intervention, such as input data inclusion and exclusion, algorithm version, handling of low-quality data, assessment of performance errors and data accessibility, were consistently under-reported. When included trials were stratified by overall risk of bias, trials at serious risk of bias (57%) were less concordant with CONSORT guidelines than trials at moderate (71%) or low (84%) risk of bias.
Although the majority of CONSORT and CONSORT-AI items were well reported, critical gaps in the reporting of methodology, reproducibility and harms persist. Addressing these gaps through trial designs that mitigate risk of bias, coupled with standardised reporting, is one step towards responsible adoption of AI to improve patient outcomes in oncology.