Somasundaram Elanchezhian, Dillman Jonathan R, Crotty Eric J, Trout Andrew T, Towbin Alexander J, Anton Christopher G, Logan Angeline, Wieland Catherine A, Felekey Samantha, Coley Brian D, Brady Samuel L
Department of Radiology, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, 3333 Burnet Ave, MLC 5033, Cincinnati, OH 45229.
Radiol Artif Intell. 2020 Sep 30;2(5):e190226. doi: 10.1148/ryai.2020190226. eCollection 2020 Sep.
To develop and validate a deep learning (DL) algorithm to identify poor-quality lateral airway radiographs.
A total of 1200 lateral airway radiographs obtained in emergency department patients between January 1, 2000, and July 1, 2019, were retrospectively queried from the picture archiving and communication system. Two radiologists classified each radiograph as adequate or inadequate. Disagreements were adjudicated by a third radiologist. The radiographs were used to train and test the DL classifiers. Three technologists and three different radiologists classified the images in the test dataset, and their performance was compared with that of the DL classifiers.
The training set had 961 radiographs and the test set had 239. The best DL classifier (ResNet-50) achieved sensitivity, specificity, and area under the receiver operating characteristic curve of 0.90 (95% confidence interval [CI]: 0.86, 0.94), 0.82 (95% CI: 0.76, 0.90), and 0.86 (95% CI: 0.81, 0.91), respectively. Interrater agreement for technologists was fair (Fleiss κ, 0.36 [95% CI: 0.29, 0.43]), while that for radiologists was moderate (Fleiss κ, 0.59 [95% CI: 0.52, 0.66]). Cohen κ values comparing the consensus rating of ResNet-50 iterations from fivefold cross-validation, the consensus technologists' rating, and the consensus radiologists' rating with the ground truth were 0.76 (95% CI: 0.63, 0.89), 0.49 (95% CI: 0.37, 0.61), and 0.66 (95% CI: 0.54, 0.78), respectively.
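For readers unfamiliar with the agreement statistics reported above, the following is a minimal sketch of how sensitivity, specificity, Cohen κ, and Fleiss κ are conventionally computed from binary ratings. It is not the authors' code. The function names, the choice of "inadequate" as the positive class (coded 1), and the toy ratings are all assumptions made for illustration and are not taken from the study data.

```python
from collections import Counter


def sensitivity_specificity(truth, pred, positive=1):
    """Sensitivity and specificity of pred against truth.

    Assumption: `positive` marks the class of interest (here, an
    inadequate radiograph coded as 1).
    """
    tp = sum(1 for t, p in zip(truth, pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(truth, pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(truth, pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(truth, pred) if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(a, b):
    """Cohen kappa between two raters (or one rater and the ground truth)."""
    n = len(a)
    labels = set(a) | set(b)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum((count_a[l] / n) * (count_b[l] / n) for l in labels)
    # Undefined (division by zero) if p_e == 1, i.e. both raters use one label.
    return (p_o - p_e) / (1 - p_e)


def fleiss_kappa(ratings):
    """Fleiss kappa for several raters.

    `ratings` is a list with one entry per image; each entry lists the
    labels given by every rater (same number of raters per image).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    labels = sorted({l for row in ratings for l in row})
    counts = [Counter(row) for row in ratings]
    # Overall proportion of all ratings falling in each category.
    p_j = [sum(c[l] for c in counts) / (n_items * n_raters) for l in labels]
    # Per-image agreement among rater pairs.
    p_i = [
        (sum(c[l] ** 2 for l in labels) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_items
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


# Toy ratings for illustration only (1 = inadequate, 0 = adequate).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
model = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
sens, spec = sensitivity_specificity(truth, model)
kappa = cohen_kappa(truth, model)
```

On these toy ratings the model agrees with the ground truth on 8 of 10 images, while chance agreement is 0.5, so κ is lower than the raw agreement. This is why κ, rather than simple percent agreement, is used to compare raters against a reference standard.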
The development and validation of DL classifiers to distinguish between adequate and inadequate lateral airway radiographs is reported. The classifiers performed significantly better than a group of technologists and as well as the radiologists. © RSNA, 2020.