Dembrower Karin, Salim Mattie, Eklund Martin, Lindholm Peter, Strand Fredrik
Capio S:t Göran Hospital, Department of Radiology, Stockholm, Sweden.
Karolinska Institutet, Department of Oncology and Pathology, Solna, Sweden.
J Med Imaging (Bellingham). 2023 Feb;10(Suppl 2):S22405. doi: 10.1117/1.JMI.10.S2.S22405. Epub 2023 Apr 5.
In double-reading of screening mammograms, artificial intelligence (AI) algorithms hold promise as a potential replacement for one of the two readers. The choice of operating point, or abnormality threshold, for the AI algorithm will affect cancer detection and workload. In our retrospective study, the baseline approach was based on matching stand-alone reader sensitivity, while the alternative approach was based on matching the combined-reader sensitivity of two humans and of AI plus human.
Full-field digital screening mammograms within the Stockholm County area between February 1, 2012, and December 30, 2015, acquired on Philips equipment, were collected. All exams of women with breast cancer within 23 months of screening and a random selection of healthy controls were included. An exam-level continuous AI abnormality score was generated (Insight MMG from Lunit Inc). Sensitivity and abnormal interpretation rates were estimated for operating points defined by the standalone-reader approach and the combined-reader approach.
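The two ways of choosing an operating point can be sketched on toy data. This is a minimal illustration, not the study's code. It assumes that an exam counts as abnormal when either reader 1 or the AI flags it, and that the operating point is the highest threshold whose sensitivity reaches the target. The scores, flags, and function names are hypothetical.

```python
def sensitivity(flags, is_cancer):
    """Fraction of cancer exams that were flagged abnormal."""
    cancer_flags = [f for f, c in zip(flags, is_cancer) if c]
    return sum(cancer_flags) / len(cancer_flags)


def abnormal_rate(flags):
    """Fraction of all exams that were flagged abnormal."""
    return sum(flags) / len(flags)


def combine(reader_flags, scores, threshold):
    """Flag an exam if the reader flagged it OR the AI score reaches the threshold."""
    return [r or s >= threshold for r, s in zip(reader_flags, scores)]


def pick_threshold(scores, is_cancer, target, reader_flags=None):
    """Return the highest threshold whose sensitivity is at least `target`.

    With reader_flags=None the AI is evaluated alone (standalone-reader
    matching); otherwise AI plus reader is evaluated (combined-reader matching).
    """
    if reader_flags is None:
        reader_flags = [False] * len(scores)
    for t in sorted(set(scores), reverse=True):
        if sensitivity(combine(reader_flags, scores, t), is_cancer) >= target:
            return t
    return None


# Hypothetical toy data: five cancer exams followed by five healthy exams.
scores = [0.9, 0.8, 0.3, 0.4, 0.6, 0.7, 0.2, 0.1, 0.5, 0.05]
is_cancer = [True] * 5 + [False] * 5
reader1 = [True, False, True, True, False, False, False, False, True, False]

# Standalone-reader matching: AI alone must reach reader 1's sensitivity.
t_standalone = pick_threshold(scores, is_cancer, sensitivity(reader1, is_cancer))

# Combined-reader matching: reader 1 + AI must reach an assumed two-reader
# sensitivity of 0.8 (a made-up target for this toy data).
t_combined = pick_threshold(scores, is_cancer, 0.8, reader_flags=reader1)

flags_standalone = combine(reader1, scores, t_standalone)
flags_combined = combine(reader1, scores, t_combined)
print(t_standalone, abnormal_rate(flags_standalone))
print(t_combined, abnormal_rate(flags_combined))
```

On this toy data the combined-reader threshold is the stricter of the two, so fewer exams are flagged once reader 1 and the AI are paired.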
The study population included 1684 exams of women with breast cancer and 5024 exams of healthy women. Observations of healthy women were up-sampled to attain a realistic proportion of cancer. The observed sensitivity for reader 1, reader 2, and readers 1 and 2 combined was 69.7%, 75.6%, and 78.6%, respectively, at abnormal interpretation rates of 4.4%, 4.6%, and 6.1%, respectively. For the combination of reader 1 and AI, we estimated a sensitivity of 82.4% for standalone-reader matching and 78.6% for combined-reader matching, at abnormal interpretation rates of 12.6% and 7.0%, respectively.
Setting the operating point by matching the stand-alone sensitivity of AI with that of a single radiologist will nearly double the downstream workload, compared with a modest increase of 15% for the alternative method of matching the sensitivity of AI plus one radiologist with that of two radiologists.
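The workload comparison follows directly from the reported abnormal interpretation rates, taking the two-radiologist rate of 6.1% as the baseline. A short check of that arithmetic (variable names are illustrative):

```python
# Abnormal interpretation rates reported in the abstract, in percent.
two_readers = 6.1          # reader 1 + reader 2 (baseline)
standalone_matching = 12.6  # reader 1 + AI, standalone-reader matching
combined_matching = 7.0     # reader 1 + AI, combined-reader matching

# Workload relative to the two-reader baseline.
ratio_standalone = standalone_matching / two_readers
ratio_combined = combined_matching / two_readers

# Relative increase over the baseline, in percent.
increase_standalone = (ratio_standalone - 1) * 100
increase_combined = (ratio_combined - 1) * 100

print(f"standalone-reader matching: x{ratio_standalone:.2f} (+{increase_standalone:.0f}%)")
print(f"combined-reader matching:   x{ratio_combined:.2f} (+{increase_combined:.0f}%)")
```

The ratios come out at roughly 2.07 and 1.15, which corresponds to the doubling and the 15% increase described above.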