Department of Orthopaedic Surgery, Great Ormond Street Hospital for Children, Institute of Child Health, University College London, London WC1N 3JH, England, UK.
Clin Orthop Relat Res. 2012 Dec;470(12):3499-505. doi: 10.1007/s11999-012-2534-x. Epub 2012 Aug 18.
BACKGROUND: Osteonecrosis is perhaps the most important serious complication after treatment of developmental dysplasia of the hip (DDH). The classification by Bucholz and Ogden has been used most frequently for grading osteonecrosis in this context, but its reliability has not been established. An unreliable classification could affect the validity of studies reporting the outcome of treatment.
QUESTIONS/PURPOSE: We established the interrater and intrarater reliabilities of this classification and analyzed the frequency and nature of disagreements.
METHODS: Three pediatric hip surgeons, a pediatric musculoskeletal radiologist, and three orthopaedic trainees graded radiographs of 39 hips according to the Bucholz and Ogden classification, blinded to any clinical data. Ratings were repeated after 2 weeks. Interrater and intrarater reliabilities were determined using the simple kappa statistic. Grading was compared among raters, the nature and frequency of disagreements were established, and subgroup analyses were performed.
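For readers unfamiliar with the statistic, the following is a minimal sketch of the simple (unweighted) Cohen kappa for two raters. It is not the authors' analysis code. The function name `cohen_kappa` and the example grades are hypothetical, and the abstract does not state how the all-rater value was pooled, so only the two-rater case is shown.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen kappa for two raters grading the same cases."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must grade the same non-empty set of cases")
    n = len(ratings_a)

    # Observed agreement: fraction of cases given the identical grade.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each grade, the product of the two raters'
    # marginal proportions, summed over all grades.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one and the same grade throughout; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades (1 to 4, standing in for Grades I to IV) for ten hips.
# These values are illustrative only and are not data from the study.
rater_1 = [1, 1, 2, 2, 3, 3, 4, 4, 1, 2]
rater_2 = [1, 2, 2, 2, 3, 4, 4, 4, 1, 1]
kappa = cohen_kappa(rater_1, rater_2)
```

Kappa rescales raw agreement by the agreement expected from the raters' marginal grade frequencies alone, so a value near 0 indicates agreement no better than chance and a value of 1 indicates perfect agreement.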
RESULTS: Interrater reliability (kappa) was 0.34 (95% CI, 0.28 to 0.40) for all raters and 0.31 (95% CI, 0.20 to 0.43) for the three surgeons. The best interrater reliability was observed between the radiologist and one surgeon, with a kappa of 0.51 (95% CI, 0.30 to 0.72). Intrarater reliability estimates ranged from 0.44 to 0.69. Raters disagreed on the grade of osteonecrosis in 26 of 39 hips (67%), and seven of these 26 disagreements (27%) involved confusion between Grades I and II.
CONCLUSIONS: The interrater reliability was lower than expected, considering the raters' experience. Distinguishing between Grades I and II was the most frequently observed problem. We believe the low reliability resulted from an ambiguous classification scheme rather than from variability among the raters. Outcome studies of DDH based on this classification should be interpreted with caution. We recommend the development of a new classification with better prognostic ability.
LEVEL OF EVIDENCE: Level III, diagnostic study. See the Guidelines for Authors for a complete description of levels of evidence.