Ma Yuntong, Bauer Justin L, Yoon Acacia H, Beaulieu Christopher F, Yoon Luke, Do Bao H, Fang Charles X
Department of Radiology, San Francisco VA Medical Center, 4150 Clement St, San Francisco, CA, 94121, USA.
Department of Radiology, Stanford Medicine, 300 Pasteur Dr, Palo Alto, CA, 94304, USA.
J Imaging Inform Med. 2025 Apr;38(2):988-996. doi: 10.1007/s10278-024-01263-y. Epub 2024 Sep 12.
To develop a deep learning model for automated classification of orthopedic hardware on pelvic and hip radiographs, which can be clinically implemented to decrease radiologist workload and improve consistency among radiology reports.
Pelvic and hip radiographs from 4279 studies in 1073 patients were retrospectively obtained and reviewed by musculoskeletal radiologists. Two convolutional neural networks, EfficientNet-B4 and NFNet-F3, were trained to classify images into the most represented categories: no hardware, total hip arthroplasty (THA), hemiarthroplasty, intramedullary nail, femoral neck cannulated screws, dynamic hip screw, lateral blade/plate, THA with additional femoral fixation, and post-infectious hip. Model performance was assessed on an independent test set of 851 studies from 262 patients and compared with the individual performance of five subspecialty-trained radiologists using leave-one-out analysis against an aggregate gold standard label.
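The leave-one-out comparison can be illustrated with a minimal sketch. The abstract does not state how the aggregate gold standard label was formed, so the majority vote of the remaining readers and the tie-breaking rule used here are assumptions, as are the function names and the toy labels, none of which come from the study.

```python
from collections import Counter


def majority_label(labels):
    """Return the most common label; ties go to the label listed first (assumed rule)."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


def leave_one_out_accuracy(reader_labels, held_out):
    """Accuracy of one reader against the majority vote of all other readers.

    reader_labels maps a reader name to a list of labels, one per study.
    The majority-vote reference is an assumption, not the paper's stated method.
    """
    others = [name for name in reader_labels if name != held_out]
    n_studies = len(reader_labels[held_out])
    correct = 0
    for i in range(n_studies):
        reference = majority_label([reader_labels[name][i] for name in others])
        if reader_labels[held_out][i] == reference:
            correct += 1
    return correct / n_studies


# Toy labels invented for illustration only (not data from the study).
toy = {
    "R1": ["THA", "none", "nail", "THA"],
    "R2": ["THA", "none", "nail", "hemi"],
    "R3": ["THA", "none", "nail", "THA"],
    "R4": ["THA", "none", "DHS", "THA"],
    "R5": ["THA", "THA", "nail", "THA"],
}

for reader in toy:
    print(reader, leave_one_out_accuracy(toy, reader))
```

Under the same assumption, a model's predictions could be scored by adding them as one more entry and holding that entry out in turn.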
For multiclass classification, the area under the receiver operating characteristic curve (AUC) was 0.99 or greater for all classes with NFNet-F3, and 0.99 or greater for all classes except post-infectious hip (AUC 0.97) with EfficientNet-B4. When compared with human observers, the models achieved an accuracy of 97%, which was non-inferior to four of the five radiologists and superior to the remaining one. Cohen's kappa coefficient for both models ranged from 0.96 to 0.97, indicating excellent inter-reader agreement.
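For readers unfamiliar with the two reported metrics, the following is a minimal sketch of the standard definitions of Cohen's kappa and a one-vs-rest AUC in plain Python. It is not the authors' evaluation code; the function names and the toy values in the usage note are hypothetical.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the two raters' marginal label frequencies.
    expected = sum(count_a[l] * count_b[l] for l in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)


def one_vs_rest_auc(scores, labels, positive):
    """AUC for one class versus all others, via pairwise comparison.

    scores holds the predicted score for the positive class in each study.
    Equals the probability that a positive study outscores a negative one,
    counting ties as one half. Requires at least one study of each kind.
    """
    pos = [s for s, l in zip(scores, labels) if l == positive]
    neg = [s for s, l in zip(scores, labels) if l != positive]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

As a usage example with invented values, `cohen_kappa(["THA", "THA", "none", "none"], ["THA", "none", "none", "none"])` gives 0.5: observed agreement is 0.75 and chance agreement is 0.5.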
A deep learning model can classify a range of orthopedic hip hardware with high accuracy and with performance comparable to that of subspecialty-trained radiologists.