Peng L Q, Wan L, Wang M W, Li Z, Wang P, Liu T A, Wang Y H, Zhao H
Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China.
Shanghai Key Laboratory of Forensic Medicine, Key Laboratory of Forensic Science, Ministry of Justice, Shanghai Forensic Service Platform, Academy of Forensic Science, Shanghai 200063, China.
Fa Yi Xue Za Zhi. 2020 Oct;36(5):622-630. doi: 10.12116/j.issn.1004-5619.2020.05.004.
Objective: To compare the performance of three deep-learning models (VGG19, Inception-V3 and Inception-ResNet-V2) in automatic bone age assessment based on pelvic X-ray radiographs.
Methods: A total of 962 pelvic X-ray radiographs of adolescents (481 males, 481 females) aged 11.0 to 21.0 years from five provinces and cities of China were collected, preprocessed and used as the study material. Eighty percent of the radiographs were randomly sampled into a training set and a validation set, which were used for model fitting and hyperparameter tuning. The remaining 20% were used as the test set to evaluate how well the models generalize. The three models were assessed by comparing the root mean square error (RMSE), the mean absolute error (MAE) and Bland-Altman plots between the model estimates and the chronological ages.
Results: The mean RMSE and MAE between the bone age estimates and the chronological ages were 1.29 and 1.02 years for the VGG19 model, 1.17 and 0.82 years for the Inception-V3 model, and 1.11 and 0.84 years for the Inception-ResNet-V2 model. The Bland-Altman plots showed that the mean difference between the bone age estimates and the chronological ages was lowest for the Inception-ResNet-V2 model.
Conclusion: In automatic bone age assessment of the adolescent pelvis, the Inception-ResNet-V2 model performs best, while the Inception-V3 model achieves accuracy similar to that of the VGG19 model.
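The three evaluation measures named in the Methods can be illustrated with a minimal sketch. It assumes paired lists of estimated and chronological ages in years; the function names and the sample values are hypothetical and are not taken from the study. The Bland-Altman summary here uses the conventional mean difference with limits of agreement at 1.96 sample standard deviations, which the abstract does not confirm as the authors' exact choice.

```python
import math
import statistics


def rmse(estimates, ages):
    """Root mean square error between estimated and chronological ages."""
    squared = [(e - a) ** 2 for e, a in zip(estimates, ages)]
    return math.sqrt(statistics.fmean(squared))


def mae(estimates, ages):
    """Mean absolute error between estimated and chronological ages."""
    absolute = [abs(e - a) for e, a in zip(estimates, ages)]
    return statistics.fmean(absolute)


def bland_altman_summary(estimates, ages):
    """Return (mean difference, lower limit, upper limit) of agreement.

    Assumption: limits are mean difference +/- 1.96 sample standard deviations.
    """
    diffs = [e - a for e, a in zip(estimates, ages)]
    bias = statistics.fmean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical values for illustration only (years).
estimated_ages = [12.0, 15.5, 18.0, 20.0]
chronological_ages = [11.0, 16.0, 18.5, 19.0]

print(f"RMSE: {rmse(estimated_ages, chronological_ages):.2f} years")
print(f"MAE:  {mae(estimated_ages, chronological_ages):.2f} years")
bias, lower, upper = bland_altman_summary(estimated_ages, chronological_ages)
print(f"Mean difference: {bias:.2f} years (limits {lower:.2f} to {upper:.2f})")
```

RMSE weights large errors more heavily than MAE, which is one reason the two measures can rank models differently, as they do for Inception-V3 and Inception-ResNet-V2 in the Results.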