Gibson Eli, Georgescu Bogdan, Ceccaldi Pascal, Trigan Pierre-Hugo, Yoo Youngjin, Das Jyotipriya, Re Thomas J, Rs Vishwanath, Balachandran Abishek, Eibenberger Eva, Chekkoury Andrei, Brehm Barbara, Bodanapally Uttam K, Nicolaou Savvas, Sanelli Pina C, Schroeppel Thomas J, Flohr Thomas, Comaniciu Dorin, Lui Yvonne W
Department of Digital Technology and Innovation, Siemens Healthineers, 755 College Rd E, Princeton, NJ 08540 (E.G., B.G., P.C., P.H.T., Y.Y., J.D., T.J.R., D.C.); Department of Digital Technology and Innovation, Siemens Healthineers, Bangalore, India (V.R.S., A.B.); Department of Computed Tomography, Siemens Healthineers, Forchheim, Germany (E.E., A.C., B.B., T.F.); Department of Radiology, University of Maryland Medical Center, Baltimore, Md (U.K.B.); Department of Radiology, Vancouver General Hospital, Vancouver, Canada (S.N.); Department of Radiology, Northwell Health, New York, NY (P.C.S.); Department of Surgery, UCHealth Memorial Hospital, Colorado Springs, Colo (T.J.S.); and Department of Radiology, NYU Langone Health, New York University School of Medicine, New York, NY (Y.W.L.).
Radiol Artif Intell. 2022 Apr 20;4(3):e210115. doi: 10.1148/ryai.210115. eCollection 2022 May.
To present a method that automatically detects, subtypes, and locates acute or subacute intracranial hemorrhage (ICH) on noncontrast CT (NCCT) head scans; generates detection confidence scores to identify high-confidence data subsets with higher accuracy; and improves radiology worklist prioritization. Such scores may enable clinicians to better use artificial intelligence (AI) tools.
This retrospective study included 46 057 studies from seven "internal" centers for development (training, architecture selection, hyperparameter tuning, and operating-point calibration; n = 25 946) and evaluation (n = 2947) and three "external" centers for calibration (n = 400) and evaluation (n = 16 764). Internal centers contributed developmental data, whereas external centers did not. Deep neural networks predicted the presence of ICH and subtypes (intraparenchymal, intraventricular, subarachnoid, subdural, and/or epidural hemorrhage) and segmentations per case. Two ICH confidence scores are discussed: a calibrated classifier entropy score and a Dempster-Shafer score. Evaluation was completed by using receiver operating characteristic curve analysis and report turnaround time (RTAT) modeling on the evaluation set and on confidence score-defined subsets using bootstrapping.
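As an illustration of how an entropy-based confidence score can define a high-confidence subset, the sketch below scores each case by one minus the binary Shannon entropy of its predicted ICH probability and keeps the most confident fraction of cases. This is a minimal sketch under stated assumptions, not the study's implementation: the function names, the "1 minus entropy" form of the score, and the toy probabilities are all hypothetical, and the abstract does not specify how the calibrated classifier entropy score or the Dempster-Shafer score is actually computed.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a Bernoulli(p) prediction."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully certain prediction carries no entropy
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def entropy_confidence(p: float) -> float:
    """Hypothetical confidence score: 1 at p = 0 or p = 1, 0 at p = 0.5."""
    return 1.0 - binary_entropy(p)


def most_confident_subset(probs, fraction=0.8):
    """Return the indices of the `fraction` of cases with the highest confidence."""
    keep = int(round(fraction * len(probs)))
    ranked = sorted(
        range(len(probs)),
        key=lambda i: entropy_confidence(probs[i]),
        reverse=True,
    )
    return sorted(ranked[:keep])


# Toy calibrated probabilities of ICH (illustrative values, not study data).
probs = [0.02, 0.97, 0.55, 0.10, 0.48]
kept = most_confident_subset(probs, fraction=0.8)
print(kept)  # indices of the cases retained in the high-confidence subset
```

In this toy example the case with probability 0.48, the one closest to 0.5, is the one left out of the 80% subset.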
The areas under the receiver operating characteristic curve for ICH were 0.97 (0.97, 0.98) and 0.95 (0.94, 0.95) on internal and external center data, respectively. On 80% of the data stratified by calibrated classifier and Dempster-Shafer scores, the system improved the Youden indexes, increasing them from 0.84 to 0.93 (calibrated classifier) and from 0.84 to 0.92 (Dempster-Shafer) for internal centers and increasing them from 0.78 to 0.88 (calibrated classifier) and from 0.78 to 0.89 (Dempster-Shafer) for external centers (P < .001). Models estimated shorter RTAT for AI-prioritized worklists with confidence measures than for AI-prioritized worklists without confidence measures, shortening RTAT by 27% (calibrated classifier) and 27% (Dempster-Shafer) for internal centers and shortening RTAT by 25% (calibrated classifier) and 27% (Dempster-Shafer) for external centers (P < .001).
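For readers unfamiliar with the Youden index reported here, it is defined as sensitivity plus specificity minus 1 at a given operating point. The sketch below computes it from binary labels and binary predictions. It is a minimal illustration only: the function name and the toy labels are hypothetical and are not taken from the study.

```python
def youden_index(labels, preds):
    """Youden index J = sensitivity + specificity - 1 for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


# Toy data (illustrative only): 4 cases with ICH, 6 without.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
j = youden_index(labels, preds)
print(round(j, 3))
```

Here sensitivity is 3/4 and specificity is 5/6, so J is about 0.583. Restricting the same calculation to a high-confidence subset of cases is how a comparison such as "0.84 to 0.93" would be formed.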
AI that provided statistical confidence measures for ICH detection on NCCT scans reliably detected and subtyped hemorrhages, identified high-confidence predictions, and improved worklist prioritization in simulation. Keywords: CT, Head/Neck, Hemorrhage, Convolutional Neural Network (CNN). © RSNA, 2022.