Wang Shuhao, Wu Dijia, Ye Lifang, Chen Zirong, Zhan Yiqiang, Li Yuehua
Department of Radiology, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, No. 600, Yi Shan Road, Shanghai, 200233, China.
Shanghai United Imaging Intelligence Co., Ltd., No. 2879, Long Teng Boulevard, Shanghai, 200232, China.
Eur Radiol. 2023 Mar;33(3):1824-1834. doi: 10.1007/s00330-022-09156-w. Epub 2022 Oct 10.
To evaluate deep neural networks for automatic rib fracture detection on thoracic CT scans and to compare their performance with that of attending-level radiologists using large datasets from multiple medical institutions.
In this retrospective study, an internal dataset of 12,208 emergency room (ER) trauma patients and an external dataset of 1,613 ER trauma patients who underwent chest CT were included. Two cascaded deep neural networks based on an extended U-Net architecture were developed to segment ribs and detect rib fractures, respectively. Model performance was evaluated with 95% confidence intervals (CIs) on both the internal and external datasets and compared with attending-level radiologist readings using a t test.
On the internal dataset, the AUC of the model for detecting fractures at the per-rib level was 0.970 (95% CI: 0.968, 0.972), with a sensitivity of 93.3% (95% CI: 92.0%, 94.4%) at a specificity of 98.4% (95% CI: 98.3%, 98.5%). On the external dataset, the model obtained an AUC of 0.943 (95% CI: 0.941, 0.945), with a sensitivity of 86.2% (95% CI: 85.0%, 87.3%) at a specificity of 98.8% (95% CI: 98.7%, 98.9%), compared with a sensitivity of 70.5% (95% CI: 69.3%, 71.8%) (p < 0.0001) and a specificity of 98.8% (95% CI: 98.7%, 98.9%) (p = 0.175) for attending radiologists.
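As an illustration of how per-rib sensitivity and specificity with 95% CIs of the kind reported above can be computed, the following is a minimal Python sketch. The abstract does not state which interval method the authors used; the Wilson score interval here is an assumption, the function names are hypothetical, and the counts are made-up illustrative values, not the study's data.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%).

    Assumption: the abstract does not say which CI method was used;
    Wilson is chosen here only as a common, well-behaved option.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (estimate, ci_low, ci_high) for sensitivity and specificity.

    tp/fn count fractured ribs that were detected/missed;
    tn/fp count intact ribs that were correctly/incorrectly flagged.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": (sens, *wilson_interval(tp, tp + fn)),
        "specificity": (spec, *wilson_interval(tn, tn + fp)),
    }


# Illustrative per-rib counts only (hypothetical, not taken from the study).
result = sensitivity_specificity(tp=90, fn=10, tn=980, fp=20)

if __name__ == "__main__":
    for name, (est, low, high) in result.items():
        print(f"{name}: {est:.3f} (95% CI: {low:.3f}, {high:.3f})")
```

Because ribs from the same patient are not independent, a per-rib binomial interval like this one may be narrower than an interval that accounts for within-patient clustering; the sketch is meant only to show the arithmetic behind the reported figures.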
The proposed deep learning (DL) model is a feasible approach for identifying rib fractures on chest CT scans, performing at least on par with attending-level radiologists.
• Deep learning-based algorithms automatically detected rib fractures with high sensitivity and reasonable specificity on chest CT scans.
• Deep learning-based algorithms achieved diagnostic measures comparable to those of attending-level radiologists for rib fracture detection on chest CT scans.
• The deep learning models, like human readers, were susceptible to the inconspicuity and ambiguity of target lesions. More training data was required for subtle lesions to achieve comparable detection performance.