Díaz Moreno Alejandro, Cano Alonso Raquel, Fernández Alfonso Ana, Álvarez Vázquez Ana, Carrascoso Arranz Javier, López Alcolea Julia, García Castellanos David, Sanabria Greciano Lucía, Recio Rodríguez Manuel, Andreu-Vázquez Cristina, Thuissard Vasallo Israel John, Martínez De Vega Vicente
Hospital Universitario QuironSalud Madrid, 28223 Madrid, Spain.
Department of Medicine, Faculty of Medicine, Health and Sports, Universidad Europea de Madrid, 28670 Madrid, Spain.
Diagnostics (Basel). 2025 Feb 18;15(4):491. doi: 10.3390/diagnostics15040491.
Background/Objectives: The growing use of artificial intelligence (AI) in reading musculoskeletal radiographs has significant potential to improve diagnostic accuracy and optimize clinical workflow. However, assessing its performance in clinical environments is essential for successful implementation. We hypothesized that our AI system, applied to urgent bone X-rays, could detect fractures, joint dislocations, and effusion with high sensitivity (Sens) and specificity (Spec). The specific objectives of our study were: (1) to determine the Sens and Spec of AI in detecting bone fractures, dislocations, and elbow joint effusion compared to the gold standard (GS); (2) to evaluate the concordance between AI and radiology residents (RR); and (3) to compare the proportion of doubtful results reported by AI and by the RR, and the proportion of those confirmed by the GS.
Methods: We conducted an observational, double-blind, retrospective study of adult bone X-rays (BXRs) referred from the emergency department at our center between October and November 2022. The final sample comprised 792 BXRs, categorized into three groups: large joints, small joints, and long-flat bones. Our AI system detects fractures, dislocations, and elbow effusions, and reports each result as positive, negative, or doubtful. We compared the diagnostic performance of AI and the RR against that of a senior radiologist (GS).
Results: The study population's median age was 48 years; 48.6% were male. For fracture detection, the RR achieved Sens = 90.6% and Spec = 98%, and AI achieved Sens = 95.8% and Spec = 97.6%. For dislocation detection, the RR achieved higher Sens (77.8%) and Spec (100%) than AI. The Kappa coefficient between RR and AI was 0.797 for fractures in large joints, and concordance was considered acceptable for all other variables. We also analyzed doubtful cases and their confirmation by the GS. Additionally, we analyzed findings not detected by AI, such as chronic fractures, arthropathy, focal lesions, and anatomical variants.
Conclusions: This study assessed the impact of AI in a real-world clinical setting, comparing its performance with that of radiologists (both in training and senior). AI achieved high Sens, Spec, and area under the curve (AUC) in bone fracture detection and showed strong concordance with the RR. In conclusion, AI has the potential to be a valuable screening tool, helping reduce missed diagnoses in clinical practice.
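The metrics reported above (Sens and Spec against the GS, and the Kappa coefficient between two readers) can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration: the function names `sens_spec` and `cohen_kappa` are hypothetical, the ten-case label lists are toy data rather than study data, and readings are treated as binary (1 = finding present, 0 = absent). The abstract does not state how doubtful results entered the calculations, so that category is left out here.

```python
from collections import Counter


def sens_spec(reference, predicted):
    """Return (sensitivity, specificity) of binary predictions vs. a reference.

    Assumes the reference contains at least one positive and one negative case;
    otherwise one of the ratios is undefined.
    """
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # true positives
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # false negatives
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # true negatives
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa: agreement between two readers, corrected for chance."""
    n = len(reader_a)
    observed = sum(1 for a, b in zip(reader_a, reader_b) if a == b) / n
    count_a, count_b = Counter(reader_a), Counter(reader_b)
    labels = set(reader_a) | set(reader_b)
    # Chance agreement from each reader's marginal label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    return (observed - expected) / (1 - expected)


# Toy labels (hypothetical, NOT taken from the study).
gs = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # senior radiologist (gold standard)
ai = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]  # AI reading
rr = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # radiology resident reading

ai_sens, ai_spec = sens_spec(gs, ai)
rr_sens, rr_spec = sens_spec(gs, rr)
kappa_rr_ai = cohen_kappa(rr, ai)

print(f"AI: Sens={ai_sens:.3f} Spec={ai_spec:.3f}")
print(f"RR: Sens={rr_sens:.3f} Spec={rr_spec:.3f}")
print(f"Kappa (RR vs AI) = {kappa_rr_ai:.3f}")
```

Sens and Spec each compare one reader with the GS, whereas Kappa compares the RR and AI with each other, so two readers can agree closely while both miss the same GS-positive case.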