Department of Radiology, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China.
Department of Traumatology, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China.
Skeletal Radiol. 2024 Dec;53(12):2635-2642. doi: 10.1007/s00256-024-04664-w. Epub 2024 May 2.
To determine which bones and which grades had the highest inter-rater variability when employing the Tanner-Whitehouse (T-W) method.
Twenty-four radiologists were recruited and trained in the T-W classification of skeletal development. Their consistency and skill in determining bone development status were assessed using 20 pediatric hand radiographs of children aged 1 to 18 years. Four radiologists had a poor concordance rate and were excluded. The remaining 20 radiologists undertook a repeat reading of the radiographs, and their results were compared against the mean assessment of two senior experts, which served as the reference standard. Concordance rate, scoring, and Kendall's W were calculated to evaluate accuracy and consistency.
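The abstract does not give the formulas behind these statistics, so the following is only a minimal sketch of how Kendall's W and a concordance rate could be computed. It assumes the standard tie-corrected definition of Kendall's W, treats the concordance rate as the fraction of a reader's grades that exactly match the reference, and encodes T-W grades as integers; the function names and the encoding are hypothetical and not taken from the study.

```python
from collections import Counter


def average_ranks(scores):
    """Rank values in ascending order; tied values share the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # 1-based mean rank of the tied run
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kendalls_w(ratings):
    """Tie-corrected Kendall's W.

    ratings[j][i] is the grade (assumed integer-coded) that rater j
    assigned to radiograph i. Assumes at least one rater did not give
    every radiograph the same grade, otherwise the denominator is zero.
    """
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    # Tie correction: sum of (t^3 - t) over every group of tied grades per rater.
    ties = sum(sum(c ** 3 - c for c in Counter(row).values()) for row in ratings)
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)


def concordance_rate(reader, reference):
    """Fraction of grades that exactly match the reference (assumed definition)."""
    return sum(a == b for a, b in zip(reader, reference)) / len(reference)
```

Under these assumptions, `kendalls_w` returns 1 when every rater orders the radiographs identically and 0 when the rank totals are all equal, and `concordance_rate` would be applied once per reader, bone, or grade against the expert reference.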
Both the radius, ulna, and short bones (RUS) system (Kendall's W = 0.833) and the carpal (C) system (Kendall's W = 0.944) had excellent consistency, with the RUS system outperforming the C system in terms of scores. The repeatability analysis showed that the second rating test, performed after 2 months of further bone age assessment (BAA) practice, was more consistent and accurate than the first. The capitate had the lowest average concordance rate and score, and its D grade had the lowest concordance rate of all grades. Moreover, the G grades of all seven carpal bones had concordance rates below 0.6. The bones with lower Kendall's W were likewise those with lower scores and concordance rates.
The D grade of the capitate showed the highest variation, and bone age (BA) determination with the Tanner-Whitehouse 3rd edition (T-W3) method was frequently inconsistent. A more comprehensive description of the bones and grades that are rated inaccurately, together with a modification of the T-W3 approach, would significantly advance BAA.