Department of Oncology-Pathology, Karolinska Institute, Stockholm, Sweden.
Department of Radiology, Karolinska University Hospital, Stockholm, Sweden.
JAMA Oncol. 2020 Oct 1;6(10):1581-1588. doi: 10.1001/jamaoncol.2020.3321.
IMPORTANCE: A computer algorithm that performs at or above the level of radiologists in mammography screening assessment could improve the effectiveness of breast cancer screening.
OBJECTIVE: To perform an external evaluation of 3 commercially available artificial intelligence (AI) computer-aided detection algorithms as independent mammography readers and to assess the screening performance when combined with radiologists.
DESIGN, SETTING, AND PARTICIPANTS: This retrospective case-control study was based on a double-reader population-based mammography screening cohort of women screened at an academic hospital in Stockholm, Sweden, from 2008 to 2015. The study included 8805 women aged 40 to 74 years who underwent mammography screening and who did not have implants or prior breast cancer. The study sample included 739 women who were diagnosed as having breast cancer (positive) and a random sample of 8066 healthy controls (negative for breast cancer).
MAIN OUTCOMES AND MEASURES: Positive follow-up findings were determined by pathology-verified diagnosis at screening or within 12 months thereafter. Negative follow-up findings were determined by a 2-year cancer-free follow-up. Three AI computer-aided detection algorithms (AI-1, AI-2, and AI-3), sourced from different vendors, yielded a continuous score for the suspicion of cancer in each mammography examination. For a decision of normal or abnormal, the cut point was defined by the mean specificity of the first-reader radiologists (96.6%).
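As an illustration of how a continuous suspicion score can be turned into a normal/abnormal decision at a fixed specificity, the following is a minimal Python sketch. It is not the study's code: the function names, the tie-handling rule (scores strictly above the cut point are abnormal), and the toy scores are all assumptions made for this example.

```python
import math


def cut_point_for_specificity(negative_scores, target_specificity):
    """Pick a score threshold from cancer-free examinations.

    Assumption for this sketch: an examination is called abnormal when its
    score is strictly greater than the threshold. The returned threshold is
    the smallest observed score that keeps at least `target_specificity` of
    the negatives at or below it.
    """
    ordered = sorted(negative_scores)
    n = len(ordered)
    # Number of negatives that must be classified as normal.
    # The small epsilon guards against floating-point overshoot in the product.
    needed = max(1, math.ceil(target_specificity * n - 1e-9))
    return ordered[needed - 1]


def specificity(negative_scores, threshold):
    """Fraction of cancer-free examinations classified as normal."""
    return sum(s <= threshold for s in negative_scores) / len(negative_scores)


def sensitivity(positive_scores, threshold):
    """Fraction of cancer examinations classified as abnormal."""
    return sum(s > threshold for s in positive_scores) / len(positive_scores)


# Toy scores invented for illustration only; they are not study data.
toy_negatives = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
toy_positives = [0.5, 0.65, 0.9, 0.95]

toy_threshold = cut_point_for_specificity(toy_negatives, 0.75)
toy_spec = specificity(toy_negatives, toy_threshold)
toy_sens = sensitivity(toy_positives, toy_threshold)
```

In the study's setting, the target would be the first-reader mean specificity of 96.6%, and the sensitivity at the resulting cut point is what allows the algorithms and radiologists to be compared at the same operating point.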
RESULTS: The median age of study participants was 60 years (interquartile range, 50-66 years) for the 739 women who received a diagnosis of breast cancer and 54 years (interquartile range, 47-63 years) for the 8066 healthy controls. The cases positive for cancer comprised 618 (84%) that were screen detected and 121 (16%) that were clinically detected within 12 months of the screening examination. The area under the receiver operating characteristic curve for cancer detection was 0.956 (95% CI, 0.948-0.965) for AI-1, 0.922 (95% CI, 0.910-0.934) for AI-2, and 0.920 (95% CI, 0.909-0.931) for AI-3. At the specificity of the radiologists, the sensitivities were 81.9% for AI-1, 67.0% for AI-2, 67.4% for AI-3, 77.4% for the first-reader radiologists, and 80.1% for the second-reader radiologists. Combining AI-1 with first-reader radiologists achieved 88.6% sensitivity at 93.0% specificity (abnormal defined by either of the 2 making an abnormal assessment). No other examined combination of AI algorithms and radiologists surpassed this sensitivity level.
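The "either of the 2" combination rule described above can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' analysis code: the function names and the eight toy examinations are assumptions invented for this example, chosen only to show why an either-reader rule tends to raise sensitivity while lowering specificity.

```python
def either_abnormal(reader_a, reader_b):
    """Combine two readers: abnormal if either one flags the examination."""
    return [a or b for a, b in zip(reader_a, reader_b)]


def sensitivity_specificity(flags, has_cancer):
    """Return (sensitivity, specificity) for binary flags against ground truth."""
    true_pos = sum(f and c for f, c in zip(flags, has_cancer))
    true_neg = sum((not f) and (not c) for f, c in zip(flags, has_cancer))
    n_pos = sum(has_cancer)
    n_neg = len(has_cancer) - n_pos
    return true_pos / n_pos, true_neg / n_neg


# Toy data invented for illustration only; they are not study data.
toy_truth = [True, True, True, True, False, False, False, False]
toy_radiologist = [True, True, False, False, False, False, True, False]
toy_ai = [True, False, True, False, False, True, False, False]

toy_combined = either_abnormal(toy_radiologist, toy_ai)
radiologist_alone = sensitivity_specificity(toy_radiologist, toy_truth)
combined_result = sensitivity_specificity(toy_combined, toy_truth)
```

Because the combined reader flags every examination that either reader flags, it can only gain true positives and false positives relative to a single reader. This is consistent with the reported pattern, in which the AI-1 plus first-reader combination reached higher sensitivity (88.6%) at a specificity (93.0%) below the 96.6% used for the individual readers.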
CONCLUSIONS AND RELEVANCE: To our knowledge, this study is the first independent evaluation of several AI computer-aided detection algorithms for screening mammography. The results of this study indicated that a commercially available AI computer-aided detection algorithm can assess screening mammograms with a sufficient diagnostic performance to be further evaluated as an independent reader in prospective clinical trials. Combining the first readers with the best algorithm identified more cases positive for cancer than combining the first readers with second readers.