Department of Radiology, Balgrist University Hospital
Faculty of Medicine, University of Zurich
Invest Radiol. 2020 Aug;55(8):499-506. doi: 10.1097/RLI.0000000000000664.
The aim of this study was to clinically validate a deep convolutional neural network (DCNN) for the detection of surgically proven anterior cruciate ligament (ACL) tears in a large patient cohort and to analyze the effect of magnetic resonance examinations from different institutions, with varying protocols and field strengths.
After ethics committee approval, this retrospective analysis of prospectively collected data was performed on 512 consecutive subjects who underwent knee magnetic resonance imaging (MRI) at a total of 59 different institutions, followed by arthroscopic knee surgery at our institution. The DCNN and 3 fellowship-trained full-time academic musculoskeletal radiologists independently evaluated the MRI examinations for full-thickness ACL tears. Surgical reports served as the reference standard. Statistical analysis included sensitivity, specificity, area under the receiver operating characteristic curve (AUC ROC), and kappa statistics. P values less than 0.05 were considered statistically significant.
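As an illustration of the metrics named above, the following is a minimal sketch of how sensitivity, specificity, and Cohen's kappa can be computed from binary labels. It is not the study's analysis code: the function names are hypothetical, the ten example cases are invented toy data rather than study results, and the choice of Cohen's kappa is an assumption, since the abstract mentions only "kappa statistics".

```python
from typing import Sequence, Tuple


def confusion_counts(truth: Sequence[int], pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels: 1 = ACL tear, 0 = no tear."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of reference-standard tears that were detected."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of reference-standard intact ligaments rated as intact."""
    return tn / (tn + fp)


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Chance-corrected agreement between two binary raters."""
    if len(a) != len(b):
        raise ValueError("both raters must rate the same cases")
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    # Agreement expected by chance if the two raters were independent.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both raters gave one identical constant rating
    return (observed - expected) / (1 - expected)


# Invented toy data (NOT from the study): 10 knees, 4 with a tear at surgery.
surgery = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # reference standard
model = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]    # hypothetical DCNN calls
reader = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # hypothetical radiologist calls

tp, fp, fn, tn = confusion_counts(surgery, model)
model_sensitivity = sensitivity(tp, fn)
model_specificity = specificity(tn, fp)
model_reader_kappa = cohen_kappa(model, reader)
```

In this toy example the model misses one tear and raises one false alarm, so its sensitivity is 3/4 and its specificity is 5/6. Kappa is computed between the two raters rather than against the surgical reference.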
Anterior cruciate ligament tears were present in 45.7% (234/512) and absent in 54.3% (278/512) of the subjects. For the entire cohort, the DCNN had a sensitivity of 96.1%, which was not significantly different from that of the readers (97.5%-97.9%; all P ≥ 0.118), but a significantly lower specificity of 93.1% (readers, 99.6%-100%; all P < 0.001) and AUC ROC of 0.935 (readers, 0.989-0.991; all P < 0.001). Subgroup analysis showed significantly lower sensitivity, specificity, and AUC ROC of the DCNN for outside MRI examinations (92.5%, 87.1%, and 0.898, respectively) than for in-house MRI examinations (99.0%, 94.4%, and 0.967, respectively) (P = 0.026, P = 0.043, and P < 0.05, respectively). There were no significant differences in DCNN performance between 1.5-T and 3-T MRI examinations (all P ≥ 0.753).
Deep convolutional neural network performance for ACL tear diagnosis can approach that of fellowship-trained full-time academic musculoskeletal radiologists at 1.5 T and 3 T; however, the performance may decrease with increasing MRI examination heterogeneity.