Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China.
West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610000, China.
Eur Radiol. 2022 Oct;32(10):7196-7216. doi: 10.1007/s00330-022-08956-4. Epub 2022 Jun 27.
To systematically quantify the diagnostic accuracy of artificial intelligence (AI) in diagnosing orthopedic fractures and to identify potential covariates affecting its performance.
PubMed, Embase, Web of Science, and the Cochrane Library were systematically searched from inception to September 29, 2021, for studies on AI applications in diagnosing orthopedic fractures. Pooled sensitivity, pooled specificity, and the area under the receiver operating characteristic curve (AUC) were obtained. This study was registered in the PROSPERO database prior to initiation (CRD42021254618).
Thirty-nine studies were eligible for quantitative analysis. The overall pooled AUC, sensitivity, and specificity were 0.96 (95% CI 0.94-0.98), 90% (95% CI 87-92%), and 92% (95% CI 90-94%), respectively. In subgroup analyses, studies with a multicenter design yielded higher sensitivity (92% vs. 88%) and specificity (94% vs. 91%) than single-center studies. AI demonstrated higher sensitivity with transfer learning (with vs. without: 92% vs. 87%) and with data augmentation (with vs. without: 92% vs. 87%). Using plain X-rays as input images for AI achieved results comparable to CT (AUC 0.96 vs. 0.96). Moreover, AI achieved results comparable to those of human readers overall (AUC 0.97 vs. 0.97) and better results than non-expert human readers (AUC 0.98 vs. 0.96; sensitivity 95% vs. 88%).
AI demonstrated high accuracy in diagnosing orthopedic fractures from medical images. Larger-scale studies with higher design quality are needed to validate our findings.
• Multicenter study design, application of transfer learning, and data augmentation are closely related to improving the performance of artificial intelligence models in diagnosing orthopedic fractures.
• Utilizing plain X-rays as input images for AI to diagnose fractures achieved results comparable to CT (AUC 0.96 vs. 0.96).
• AI achieved comparable results to humans (AUC 0.97 vs. 0.97) but was superior to non-expert human readers (AUC 0.98 vs. 0.96, sensitivity 95% vs. 88%) in diagnosing fractures.
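As an illustration of what "pooled sensitivity and specificity" means, the sketch below pools per-study proportions on the logit scale with fixed-effect inverse-variance weights. This is a deliberately simplified stand-in: the abstract does not state which pooling model was used, and diagnostic meta-analyses commonly fit bivariate random-effects models instead. The function names and the 2x2 counts are hypothetical and are not data from the included studies.

```python
import math


def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: map log-odds back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportion(events, totals):
    """Pool proportions across studies (fixed effect, logit scale).

    Returns (estimate, lower, upper), where the bounds form an
    approximate 95% confidence interval. A 0.5 continuity correction
    is applied to every study so that 0% or 100% proportions do not
    produce infinite logits.
    """
    estimates, weights = [], []
    for e, n in zip(events, totals):
        e_c = e + 0.5                 # corrected event count
        non_e_c = (n - e) + 0.5       # corrected non-event count
        p = e_c / (e_c + non_e_c)
        variance = 1.0 / e_c + 1.0 / non_e_c  # variance of the logit
        estimates.append(logit(p))
        weights.append(1.0 / variance)
    total_w = sum(weights)
    pooled = sum(w * x for w, x in zip(weights, estimates)) / total_w
    se = math.sqrt(1.0 / total_w)
    return (inv_logit(pooled),
            inv_logit(pooled - 1.96 * se),
            inv_logit(pooled + 1.96 * se))


# Hypothetical per-study 2x2 counts: (TP, FN, FP, TN). Illustration only.
studies = [(90, 10, 8, 92), (45, 5, 6, 44), (170, 30, 12, 188)]

tp = [s[0] for s in studies]
fn = [s[1] for s in studies]
fp = [s[2] for s in studies]
tn = [s[3] for s in studies]

# Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
sensitivity = pool_proportion(tp, [a + b for a, b in zip(tp, fn)])
specificity = pool_proportion(tn, [a + b for a, b in zip(tn, fp)])

print("pooled sensitivity (est, lo, hi):", sensitivity)
print("pooled specificity (est, lo, hi):", specificity)
```

Pooling on the logit scale keeps the estimate and its interval inside (0, 1). A fixed-effect model ignores between-study heterogeneity and the correlation between sensitivity and specificity, so it should be read only as a conceptual aid.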