University of Witten/Herdecke, Witten, Germany.
Pediatric Endocrinology, University Children's Hospital, Tübingen, Germany.
Sci Rep. 2022 Apr 16;12(1):6388. doi: 10.1038/s41598-022-10292-y.
The BoneXpert method for automated determination of bone age from hand X-rays was introduced in 2009 and is currently running in over 200 hospitals. The aim of this work is to present version 3 of the method and to validate both its accuracy and its self-validation mechanism, which automatically rejects an image if it is at risk of being analysed incorrectly. The training set included 14,036 images from the 2017 Radiological Society of North America (RSNA) Bone Age Challenge, 1,642 images of normal Dutch and Californian children, and 8,250 images from Tübingen from patients with short stature, congenital adrenal hyperplasia and precocious puberty. The cross-validated root mean square (RMS) error on the Tübingen images was 0.62 y, compared to 0.72 y for the previous version. The RMS error on the RSNA test set of 200 images was 0.45 y relative to the average of six manual ratings. The self-validation mechanism rejected 0.4% of the RSNA images. Among the self-validated images of the Tübingen study, 121 outliers were rerated; in 6 cases BoneXpert deviated by more than 1.5 years from the average of the three re-ratings, compared to 72 such cases for the original manual ratings. The accuracy of BoneXpert is clearly better than that of a single manual rating. The self-validation mechanism rejected very few images, typically those with abnormal anatomy, and among the accepted images there were 12 times fewer severe bone age errors than in manual ratings (6 versus 72), suggesting that BoneXpert could be safer than manual rating.
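The two accuracy measures quoted in the abstract, the RMS error against the average of several manual ratings and the count of deviations above 1.5 years, can be illustrated with a minimal sketch. This is not the BoneXpert implementation: the function names (`mean_rating`, `rms_error`, `count_severe`) and the bone-age values are made up for illustration, and only the 1.5-year threshold is taken from the abstract.

```python
import math


def mean_rating(ratings_per_image):
    """Average the manual ratings of each image to form a reference bone age."""
    return [sum(ratings) / len(ratings) for ratings in ratings_per_image]


def rms_error(predicted, reference):
    """Root mean square of the differences between two equal-length lists (years)."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def count_severe(predicted, reference, threshold=1.5):
    """Count images whose rating deviates from the reference by more than threshold years."""
    return sum(1 for p, r in zip(predicted, reference) if abs(p - r) > threshold)


# Hypothetical data: three images, each with three manual ratings (years).
manual = [[10.0, 10.5, 9.5], [12.0, 12.0, 12.0], [8.0, 8.5, 7.5]]
automated = [10.0, 14.0, 8.0]  # hypothetical automated bone ages

reference = mean_rating(manual)
print(rms_error(automated, reference))     # RMS error in years
print(count_severe(automated, reference))  # number of deviations above 1.5 y
```

Applied to the figures in the abstract, the same counting gives 6 severe deviations for BoneXpert against 72 for the original manual ratings, a ratio of 12.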