Warwick Screening & Warwick Evidence, Warwick Medical School, University of Warwick, Coventry, UK.
Population Health Science, University of Bristol, Bristol, UK.
Thorax. 2024 Oct 16;79(11):1040-1049. doi: 10.1136/thorax-2024-221662.
To examine the accuracy and impact of artificial intelligence (AI) software assistance in lung cancer screening using CT.
A systematic review of CE-marked, AI-based software for automated detection and analysis of nodules in CT lung cancer screening was conducted. Multiple databases including Medline, Embase and Cochrane CENTRAL were searched from 2012 to March 2023. Primary research reporting test accuracy or impact on reading time or clinical management was included. QUADAS-2 and QUADAS-C were used to assess risk of bias. We undertook narrative synthesis.
Eleven studies evaluating six different AI-based software products and reporting on 19 770 patients were eligible. All were at high risk of bias with multiple applicability concerns. Compared with unaided reading, AI-assisted reading was faster and generally improved sensitivity (+5% to +20% for detecting/categorising actionable nodules; +3% to +15% for detecting/categorising malignant nodules), but with lower specificity (-7% to -3% for correctly detecting/categorising people without actionable nodules; -8% to -6% for correctly detecting/categorising people without malignant nodules). AI assistance tended to increase the proportion of nodules allocated to higher risk categories. Assuming 0.5% cancer prevalence, these results would translate into an additional 150-750 cancers detected per million people attending screening, but would also lead to an additional 59 700 to 79 600 people without cancer receiving unnecessary CT surveillance.
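The per-million figures can be reproduced with simple arithmetic. The sketch below is illustrative only and is not taken from the review: it assumes the reported sensitivity and specificity changes for malignant nodules are absolute differences (percentage points) applied to a hypothetical cohort of one million screening attendees. The function name `extra_per_million` and its parameters are introduced here for illustration.

```python
def extra_per_million(prevalence, sens_gain, spec_loss, cohort=1_000_000):
    """Return (extra cancers detected, extra people without cancer flagged).

    Assumes sens_gain and spec_loss are absolute changes, expressed as
    fractions (e.g. 0.03 for a 3 percentage point change).
    """
    with_cancer = cohort * prevalence          # people who have cancer
    without_cancer = cohort - with_cancer      # people who do not
    extra_cancers = with_cancer * sens_gain    # additional true positives
    extra_false_pos = without_cancer * spec_loss  # additional false positives
    return round(extra_cancers), round(extra_false_pos)


# Lower and upper ends of the ranges reported for malignant nodules:
# sensitivity +3% to +15%, specificity -6% to -8%, at 0.5% prevalence.
low = extra_per_million(0.005, sens_gain=0.03, spec_loss=0.06)
high = extra_per_million(0.005, sens_gain=0.15, spec_loss=0.08)

print(low)   # (150, 59700)
print(high)  # (750, 79600)
```

Under these assumptions, the 5000 people with cancer per million yield 150 to 750 additional detections, while the 995 000 people without cancer yield 59 700 to 79 600 additional false positives, matching the figures in the abstract.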
AI assistance in lung cancer screening may improve sensitivity but increases the number of false-positive results and unnecessary surveillance. Future research needs to increase the specificity of AI-assisted reading and minimise risk of bias and applicability concerns through improved study design.
PROSPERO registration number: CRD42021298449.