Jung Jongyun, Dai Jingyuan, Liu Bowen, Wu Qing
Department of Biomedical Informatics (Dr. Qing Wu, Jongyun Jung, and Jingyuan Dai), College of Medicine, The Ohio State University, Columbus, Ohio, United States of America.
Department of Mathematics and Statistics, Division of Computing, Analytics, and Mathematics, School of Science and Engineering (Bowen Liu), University of Missouri-Kansas City, Kansas City, Missouri, United States of America.
PLOS Digit Health. 2024 Jan 30;3(1):e0000438. doi: 10.1371/journal.pdig.0000438. eCollection 2024 Jan.
Artificial Intelligence (AI), encompassing Machine Learning and Deep Learning, has increasingly been applied to fracture detection using diverse imaging modalities and data types. This systematic review and meta-analysis aimed to assess the efficacy of AI in detecting fractures across imaging modalities and data types (image, tabular, or both) and to synthesize the existing evidence on AI-based fracture detection. Peer-reviewed studies developing and validating AI for fracture detection were identified through searches in multiple electronic databases without time limitations. A hierarchical meta-analysis model was used to calculate pooled sensitivity and specificity. A diagnostic accuracy quality assessment was performed to evaluate bias and applicability. Of the 66 eligible studies, 54 identified fractures using imaging-related data, nine using tabular data, and three using both. Vertebral fractures were the most common outcome (n = 20), followed by hip fractures (n = 18). Hip fractures exhibited the highest pooled sensitivity (92%; 95% CI: 87-96, p < 0.01) and specificity (90%; 95% CI: 85-93, p < 0.01). Pooled sensitivity and specificity using image data (92%; 95% CI: 90-94, p < 0.01; and 91%; 95% CI: 88-93, p < 0.01, respectively) were higher than those using tabular data (81%; 95% CI: 77-85, p < 0.01; and 83%; 95% CI: 76-88, p < 0.01, respectively). Radiographs demonstrated the highest pooled sensitivity (94%; 95% CI: 90-96, p < 0.01) and specificity (92%; 95% CI: 89-94, p < 0.01). In the quality assessment, patient selection and reference standards were the major concerns for bias and applicability. AI displays high diagnostic accuracy for various fracture outcomes, indicating potential utility in healthcare systems for fracture diagnosis. However, enhanced transparency in reporting and adherence to standardized guidelines are necessary to improve the clinical applicability of AI. Review Registration: PROSPERO (CRD42021240359).
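To make "pooled sensitivity and specificity" concrete, the following is a minimal sketch of how per-study 2x2 counts can be combined into a pooled estimate with a 95% confidence interval. It is a simplification, not the review's method: the abstract reports a hierarchical model, whereas this sketch pools each measure separately with fixed-effect inverse-variance weights on the logit scale and ignores between-study heterogeneity and the correlation between sensitivity and specificity. The function name `pool_proportion` and the study counts are hypothetical and chosen only for illustration.

```python
import math


def inv_logit(x):
    """Map a logit value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportion(events, totals):
    """Fixed-effect inverse-variance pooling of proportions on the logit scale.

    events[i] is the number of correct calls in study i (e.g. true positives),
    totals[i] is the number of cases that could have been called correctly
    (e.g. all fractures). Returns (pooled, ci_low, ci_high).
    Assumption: a 0.5 continuity correction is applied to every study.
    """
    estimates, weights = [], []
    for e, n in zip(events, totals):
        hits = e + 0.5
        misses = (n - e) + 0.5
        estimates.append(math.log(hits / misses))      # logit of the proportion
        variance = 1.0 / hits + 1.0 / misses           # approximate logit variance
        weights.append(1.0 / variance)
    total_weight = sum(weights)
    pooled_logit = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return (
        inv_logit(pooled_logit),
        inv_logit(pooled_logit - 1.96 * se),
        inv_logit(pooled_logit + 1.96 * se),
    )


# Hypothetical per-study counts: (TP, FN, TN, FP). Not data from the review.
studies = [(90, 10, 85, 15), (45, 5, 180, 20), (160, 40, 270, 30)]

sens = pool_proportion([tp for tp, fn, tn, fp in studies],
                       [tp + fn for tp, fn, tn, fp in studies])
spec = pool_proportion([tn for tp, fn, tn, fp in studies],
                       [tn + fp for tp, fn, tn, fp in studies])

print("pooled sensitivity: %.3f (95%% CI %.3f-%.3f)" % sens)
print("pooled specificity: %.3f (95%% CI %.3f-%.3f)" % spec)
```

A hierarchical (bivariate random-effects) model would typically yield wider intervals than this sketch when studies disagree, because it also accounts for between-study variance.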