Scaringi John A, McTaggart Ryan A, Alvin Matthew D, Atalay Michael, Bernstein Michael H, Jayaraman Mahesh V, Jindal Gaurav, Movson Jonathan S, Swenson David W, Baird Grayson L
Department of Diagnostic Imaging, Warren Alpert Medical School of Brown University, Providence, RI, USA.
Brown Radiology Human Factors Lab, Department of Diagnostic Imaging, Warren Alpert Medical School of Brown University, Providence, RI, USA.
Eur Radiol. 2024 Dec 31. doi: 10.1007/s00330-024-11332-z.
We report our experience implementing an algorithm for the detection of large vessel occlusion (LVO) for suspected stroke in the emergency setting, including its performance, and offer an explanation as to why it was poorly received by radiologists.
An algorithm for the detection of LVO on CT angiography (CTA) was deployed in the emergency room at a single tertiary care hospital from September 1 to 27, 2021. The algorithm's accuracy was analyzed retrospectively.
During the study period, 48 patients (mean age, 60.3 years ± 18.2; 32 women) underwent CTA examination in the emergency department to evaluate for emergent LVO; 2 cases were positive. The LVO algorithm demonstrated a sensitivity of 100% and a specificity of 92%. Although the sensitivity at our institution was even higher than the manufacturer's reported values, the false discovery rate was 67%, leading to the perception that the algorithm was inaccurate. Correspondingly, the positive predictive value at our institution was 33%, compared with the manufacturer's reported values of 95-98%. This disparity can be attributed to disease prevalence, which was 4.1% at our institution versus 45.0-62.2% in the manufacturer's reported values: at fixed sensitivity and specificity, positive predictive value falls as prevalence falls.
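The dependence of positive predictive value on prevalence can be illustrated with Bayes' theorem. The following is a minimal sketch, not the authors' analysis: the function names are hypothetical, and it applies the locally observed sensitivity (100%) and specificity (92%) at every prevalence, whereas the manufacturer's own sensitivity and specificity are not given here. The computed values are therefore only expected to approximate the reported ones.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' theorem.

    All arguments are proportions in [0, 1].
    """
    true_pos = sensitivity * prevalence                    # P(alert and LVO)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(alert and no LVO)
    return true_pos / (true_pos + false_pos)


def false_discovery_rate(sensitivity: float, specificity: float,
                         prevalence: float) -> float:
    """False discovery rate, the complement of the PPV."""
    return 1.0 - ppv(sensitivity, specificity, prevalence)


# Assumption: sensitivity and specificity are held at the locally observed
# values; only the prevalence changes (local value, then the two ends of the
# manufacturer's reported range).
for prevalence in (0.041, 0.450, 0.622):
    p = ppv(1.00, 0.92, prevalence)
    fdr = false_discovery_rate(1.00, 0.92, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV {p:.1%}, FDR {fdr:.1%}")
```

Under these assumptions the same algorithm yields a PPV of roughly one third at the local prevalence and above 90% at the manufacturer's prevalence range, which is the pattern described above.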
Although the LVO algorithm's accuracy was as advertised, it produced more false positives than anticipated, was perceived as inaccurate, and was removed from clinical practice. This perception was likely due to a cognitive bias called the accuracy paradox. To mitigate the accuracy paradox, radiologists should be presented with metrics based on a disease prevalence similar to that of their practice when evaluating and utilizing artificial intelligence tools.
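One way to present metrics at a practice's own prevalence is to restate them as expected alert counts per fixed number of examinations. The sketch below is an illustration only: the function name, the choice of 1000 examinations, and the use of the reported sensitivity, specificity, and prevalence figures as inputs are all assumptions, not a method described by the authors.

```python
def expected_alerts(n_exams: int, sensitivity: float, specificity: float,
                    prevalence: float) -> tuple[float, float]:
    """Return (expected true alerts, expected false alerts) for n_exams."""
    with_lvo = n_exams * prevalence
    without_lvo = n_exams - with_lvo
    true_alerts = sensitivity * with_lvo
    false_alerts = (1.0 - specificity) * without_lvo
    return true_alerts, false_alerts


# Hypothetical comparison: 1000 examinations at the local prevalence (4.1%)
# and at the low end of the manufacturer's reported range (45.0%), with
# sensitivity and specificity held at the locally observed values.
for label, prevalence in (("local", 0.041), ("manufacturer", 0.450)):
    true_alerts, false_alerts = expected_alerts(1000, 1.00, 0.92, prevalence)
    print(f"{label}: about {true_alerts:.0f} true and "
          f"{false_alerts:.0f} false alerts per 1000 examinations")
```

Stated this way, a reader can see before deployment that false alerts would be expected to outnumber true alerts at a low local prevalence, even though sensitivity and specificity are unchanged.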
Question: An artificial intelligence algorithm for detecting emergent LVOs was implemented in an emergency department, but it was perceived to be inaccurate.
Findings: Although the algorithm's accuracy was both high and as advertised, the algorithm demonstrated a high false discovery rate.
Clinical relevance: The misperception of the algorithm's inaccuracy was likely due to a special case of the base rate fallacy, the accuracy paradox. Equipping radiologists with an algorithm's false discovery rate based on local prevalence will ensure realistic expectations for real-world performance.