Pan Ian, Baird Grayson L, Mutasa Simukayi, Merck Derek, Ruzal-Shapiro Carrie, Swenson David W, Ayyala Rama S
Department of Diagnostic Imaging, Rhode Island Hospital/Hasbro Children's Hospital, The Warren Alpert Medical School of Brown University, 593 Eddy St, Providence, RI 02903 (I.P., D.W.S., R.S.A.); Department of Diagnostic Imaging and Lifespan Biostatistics Core, Rhode Island Hospital, Providence, RI (G.L.B.); Department of Radiology, Columbia University Medical Center, New York, NY (S.M., C.R.); and Department of Emergency Medicine, University of Florida Shands Hospital, Gainesville, Fla (D.M.).
Radiol Artif Intell. 2020 Jul 29;2(4):e190198. doi: 10.1148/ryai.2020190198. eCollection 2020 Jul.
To develop a deep learning approach to bone age assessment based on a training set of developmentally normal pediatric hand radiographs and to compare this approach with automated and manual bone age assessment methods based on Greulich and Pyle (GP).
In this retrospective study, a convolutional neural network (trauma hand radiograph-trained deep learning bone age assessment method [TDL-BAAM]) was trained on 15 129 frontal view pediatric trauma hand radiographs obtained between December 14, 2009, and May 31, 2017, from Children's Hospital of New York, to predict chronological age. A total of 214 trauma hand radiographs from Hasbro Children's Hospital were used as an independent test set. The test set was rated by the TDL-BAAM model as well as a GP-based deep learning model (GPDL-BAAM) and two pediatric radiologists (radiologists 1 and 2) using the GP method. All ratings were compared with chronological age using mean absolute error (MAE), and standard concordance analyses were performed.
The MAE of the TDL-BAAM model was 11.1 months, compared with 12.9 months for GPDL-BAAM (P = .0005), 14.6 months for radiologist 1 (P < .0001), and 16.0 months for radiologist 2 (P < .0001). For TDL-BAAM, 95.3% of predictions were within 24 months of chronological age, compared with 91.6% for GPDL-BAAM (P = .096), 86.0% for radiologist 1 (P < .0001), and 84.6% for radiologist 2 (P < .0001). Concordance was high between all methods and chronological age (intraclass correlation coefficient > 0.93). The deep learning models demonstrated a systematic bias, with a tendency to overpredict age for younger children, whereas the radiologists showed a consistent mean bias.
A deep learning model trained on pediatric trauma hand radiographs is on par with automated and manual GP-based methods for bone age assessment and provides a foundation for developing population-specific deep learning algorithms for bone age assessment in modern pediatric populations. © RSNA, 2020. See also the commentary by Halabi in this issue.
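The agreement measures reported above (MAE, the share of predictions within 24 months of chronological age, and mean bias) can be illustrated with a minimal sketch. This is not the authors' analysis code. The function names and the five example ages are hypothetical, and the inclusive `<=` threshold is an assumption, since the abstract does not say whether a difference of exactly 24 months counts as within range.

```python
from statistics import mean

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and chronological age (months)."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))

def fraction_within(predicted, actual, threshold_months=24):
    """Share of predictions whose absolute error is at most threshold_months.
    The inclusive comparison is an assumption, not taken from the study."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= threshold_months)
    return hits / len(actual)

def mean_bias(predicted, actual):
    """Average signed error; a positive value means age is overpredicted on average."""
    return mean(p - a for p, a in zip(predicted, actual))

# Hypothetical ages in months, invented for illustration only (not study data).
chronological = [24, 60, 96, 132, 168]
predicted = [36, 66, 90, 130, 195]

mae = mean_absolute_error(predicted, chronological)
within_24 = fraction_within(predicted, chronological)
bias = mean_bias(predicted, chronological)
print(f"MAE: {mae:.1f} months, within 24 months: {within_24:.0%}, bias: {bias:+.1f} months")
```

Reporting the signed bias alongside the MAE is what separates a constant offset from the age-dependent overprediction described in the results. In practice the signed errors would be examined across the age range instead of being averaged into a single number.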