Generalizability and Bias in a Deep Learning Pediatric Bone Age Prediction Model Using Hand Radiographs.

Author affiliation

From the University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, 670 W Baltimore St, First Floor, Room 1172, Baltimore, MD 21201.

Publication information

Radiology. 2023 Feb;306(2):e220505. doi: 10.1148/radiol.220505. Epub 2022 Sep 27.

DOI: 10.1148/radiol.220505
PMID: 36165796
Abstract

Background: Although deep learning (DL) models have demonstrated expert-level ability for pediatric bone age prediction, they have shown poor generalizability and bias in other use cases.

Purpose: To quantify generalizability and bias in a bone age DL model, measured by performance on external versus internal test sets and by performance differences between demographic groups, respectively.

Materials and Methods: The winning DL model of the 2017 RSNA Pediatric Bone Age Challenge was retrospectively evaluated and trained on 12 611 pediatric hand radiographs from two U.S. hospitals. The DL model was tested from September 2021 to December 2021 on an internal validation set and an external test set of pediatric hand radiographs with diverse demographic representation. Images reporting ground-truth bone age were included for study. Mean absolute difference (MAD) between ground-truth bone age and the model-predicted bone age was calculated for each set. Generalizability was evaluated by comparing MAD between internal and external evaluation sets with use of t tests. Bias was evaluated by comparing MAD and clinically significant error rate (rate of errors changing the clinical diagnosis) between demographic groups with use of t tests or analysis of variance and χ² tests, respectively (statistically significant difference defined as P < .05).

Results: The internal validation set had images from 1425 individuals (773 boys), and the external test set had images from 1202 individuals (mean age, 133 months ± 60 [SD]; 614 boys). The bone age model generalized well to the external test set, with no difference in MAD (6.8 months in the validation set vs 6.9 months in the external set; P = .64). Model predictions would have led to clinically significant errors in 194 of 1202 images (16%) in the external test set. The MAD was greater for girls than boys in the internal validation set (P = .01) and in the subcategories of age and Tanner stage in the external test set (P < .001 for both).
Conclusion: A deep learning (DL) bone age model generalized well to an external test set, although clinically significant sex-, age-, and sexual maturity-based biases in DL bone age were identified.

© RSNA, 2022. See also the editorial by Larson in this issue.
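
The metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names and the toy bone ages below are hypothetical, and the abstract does not say which t test variant was used, so Welch's unequal-variance form is assumed here. The sketch computes MAD for two sets, a t statistic comparing their absolute errors, and the share of flagged predictions (whether an error changes the clinical diagnosis is taken as a given input, since the abstract does not state the rule).

```python
from math import sqrt
from statistics import mean, variance


def absolute_errors(truth_months, predicted_months):
    """Per-image absolute difference between ground truth and prediction."""
    return [abs(float(t) - float(p)) for t, p in zip(truth_months, predicted_months)]


def mean_absolute_difference(truth_months, predicted_months):
    """MAD in months for one evaluation set."""
    return mean(absolute_errors(truth_months, predicted_months))


def welch_t(sample_a, sample_b):
    """Welch t statistic and degrees of freedom (assumed variant, see lead-in)."""
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1))
    return t_stat, dof


def error_rate(flags):
    """Fraction of predictions flagged as clinically significant errors."""
    return sum(1 for f in flags if f) / len(flags)


# Hypothetical toy data in months (not taken from the study).
internal_truth, internal_pred = [120, 96, 150, 60], [126, 90, 141, 63]
external_truth, external_pred = [132, 84, 170, 48], [125, 92, 164, 55]

internal_mad = mean_absolute_difference(internal_truth, internal_pred)
external_mad = mean_absolute_difference(external_truth, external_pred)
t_stat, dof = welch_t(
    absolute_errors(internal_truth, internal_pred),
    absolute_errors(external_truth, external_pred),
)

# Counts reported in the abstract: 194 of 1202 external images.
reported_rate = error_rate([True] * 194 + [False] * (1202 - 194))

print(f"internal MAD = {internal_mad:.1f}, external MAD = {external_mad:.1f}")
print(f"Welch t = {t_stat:.3f}, df = {dof:.2f}")
print(f"clinically significant error rate = {reported_rate:.0%}")
```

Turning the t statistic into a P value needs the t distribution; `scipy.stats.ttest_ind(a, b, equal_var=False)` returns both in one call if SciPy is available.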

Similar articles

[1]
Generalizability and Bias in a Deep Learning Pediatric Bone Age Prediction Model Using Hand Radiographs.

Radiology. 2023-2

[2]
Evaluating the Robustness of a Deep Learning Bone Age Algorithm to Clinical Image Variation Using Computational Stress Testing.

Radiol Artif Intell. 2024-5

[3]
Enhancement of Non-Linear Deep Learning Model by Adjusting Confounding Variables for Bone Age Estimation in Pediatric Hand X-rays.

J Digit Imaging. 2023-10

[4]
Multitask Deep Learning for Segmentation and Classification of Primary Bone Tumors on Radiographs.

Radiology. 2021-11

[5]
Deep Learning Measurement of Leg Length Discrepancy in Children Based on Radiographs.

Radiology. 2020-4-21

[6]
The RSNA Pediatric Bone Age Machine Learning Challenge.

Radiology. 2018-11-27

[7]
Deep learning-based automated bone age estimation for Saudi patients on hand radiograph images: a retrospective study.

BMC Med Imaging. 2024-8-1

[8]
Deep Learning Analysis of Chest Radiographs to Triage Patients with Acute Chest Pain Syndrome.

Radiology. 2023-2

[9]
Deep Learning Model for Automated Detection and Classification of Central Canal, Lateral Recess, and Neural Foraminal Stenosis at Lumbar Spine MRI.

Radiology. 2021-7

[10]
Limited generalizability of deep learning algorithm for pediatric pneumonia classification on external data.

Emerg Radiol. 2022-2

Cited by

[1]
Determination of Skeletal Age From Hand Radiographs Using Deep Learning.

Am J Sports Med. 2025-9

[2]
Pediatrics 4.0: the Transformative Impacts of the Latest Industrial Revolution on Pediatrics.

Health Care Anal. 2025-7-21

[3]
An X-ray bone age assessment method for hands and wrists of adolescents in Western China based on feature fusion deep learning models.

Int J Legal Med. 2025-5-22

[4]
Pitfalls and Best Practices in Evaluation of AI Algorithmic Biases in Radiology.

Radiology. 2025-5

[5]
Cross-institutional validation of a polar map-free 3D deep learning model for obstructive coronary artery disease prediction using myocardial perfusion imaging: insights into generalizability and bias.

Eur J Nucl Med Mol Imaging. 2025-4-8

[6]
Bone Age Assessment Using Various Medical Imaging Techniques Enhanced by Artificial Intelligence.

Diagnostics (Basel). 2025-1-23

[7]
The impact of climate change on vulnerable populations in pediatrics: opportunities for AI, digital health, and beyond-a scoping review and selected case studies.

Pediatr Res. 2025-1-29

[8]
Children Are Not Small Adults: Addressing Limited Generalizability of an Adult Deep Learning CT Organ Segmentation Model to the Pediatric Population.

J Imaging Inform Med. 2025-6

[9]
Assessing the Performance of Models from the 2022 RSNA Cervical Spine Fracture Detection Competition at a Level I Trauma Center.

Radiol Artif Intell. 2024-11

[10]
Deep learning segmentation of mandible with lower dentition from cone beam CT.

Oral Radiol. 2025-1
