Artificial intelligence in commercial fracture detection products: a systematic review and meta-analysis of diagnostic test accuracy.

Author information

Department of Orthopaedic Surgery and Traumatology, Bern University Hospital, Inselspital, University of Bern, Bern, Switzerland.

University of Bern, Bern, Switzerland.

Publication information

Sci Rep. 2024 Oct 4;14(1):23053. doi: 10.1038/s41598-024-73058-8.

Abstract

Conventional radiography (CR) is primarily utilized for fracture diagnosis. Artificial intelligence (AI) for CR is a rapidly growing field aimed at enhancing efficiency and increasing diagnostic accuracy. However, the diagnostic performance of commercially available AI fracture detection solutions (CAAI-FDS) for CR in various anatomical regions, their synergy with human assessment, and the influence of industry funding on reported accuracy are unknown. Peer-reviewed diagnostic test accuracy (DTA) studies were identified through a systematic review of PubMed and Embase. Diagnostic performance measures were extracted for subgroups defined by product, type of rater (stand-alone AI, human unaided, human aided), funding, and anatomical region. Pooled measures were obtained with a bivariate random effects model. The impact of rater was evaluated with comparative meta-analysis. Seventeen DTA studies of seven CAAI-FDS analyzing 38,978 x-rays with 8,150 fractures were included. Stand-alone AI studies (n = 15) evaluated five CAAI-FDS: four with good sensitivities (> 90%) and moderate specificities (80-90%), and one with very poor sensitivity (< 60%) and excellent specificity (> 95%). Pooled sensitivities were good to excellent, and specificities were moderate to good, in all anatomical regions (n = 7) apart from ribs (n = 4; poor sensitivity / moderate specificity) and spine (n = 4; excellent sensitivity / poor specificity). Funded studies (n = 4) had higher sensitivity (+5%) and lower specificity (-4%) than non-funded studies (n = 11). Sensitivity did not differ significantly between stand-alone AI and AI-aided human ratings (p = 0.316), but specificity was significantly higher in the latter group (p < 0.001). Sensitivity was significantly lower in unaided human ratings than in both AI-aided human ratings and stand-alone AI ratings (both p ≤ 0.001); specificity was higher in unaided human ratings than in stand-alone AI ratings (p < 0.001) and did not differ significantly from AI-aided ratings (p = 0.316). The study demonstrates good diagnostic accuracy across most CAAI-FDS and anatomical regions, with the highest performance achieved when used in conjunction with human assessment. Diagnostic accuracy appears lower for spine and rib fractures. The impact of industry funding on reported performance is small.
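For readers less familiar with the measures reported above, the following is a minimal sketch of how sensitivity and specificity are derived from a per-study 2x2 table. The counts are invented for illustration and are not taken from the included studies. The `naive_pool` helper is hypothetical: it simply sums counts across studies and is a deliberate simplification, not the bivariate random effects model used in the review.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TwoByTwo:
    """Counts from one diagnostic accuracy study (fracture = positive)."""
    tp: int  # fractures correctly flagged
    fp: int  # normal images wrongly flagged
    fn: int  # fractures missed
    tn: int  # normal images correctly cleared

    def sensitivity(self) -> float:
        # Share of true fractures that were detected.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of fracture-free images that were correctly cleared.
        return self.tn / (self.tn + self.fp)


def naive_pool(tables: Iterable[TwoByTwo]) -> TwoByTwo:
    """Sum counts across studies (ignores between-study heterogeneity)."""
    tables = list(tables)
    return TwoByTwo(
        tp=sum(t.tp for t in tables),
        fp=sum(t.fp for t in tables),
        fn=sum(t.fn for t in tables),
        tn=sum(t.tn for t in tables),
    )


# Illustrative, made-up counts for two hypothetical studies.
studies = [
    TwoByTwo(tp=90, fp=15, fn=10, tn=85),
    TwoByTwo(tp=45, fp=5, fn=5, tn=95),
]
pooled = naive_pool(studies)
print(f"sensitivity={pooled.sensitivity():.2f}, "
      f"specificity={pooled.specificity():.2f}")
```

Summing counts weights each study by its size and treats all studies as if they shared one true accuracy. A bivariate random effects model instead estimates sensitivity and specificity jointly and allows them to vary between studies, so its pooled values will generally differ from this simple sum.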

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3b46/11452402/fb487a63c974/41598_2024_73058_Fig1_HTML.jpg
