Reem Jennifer, Carney Joseph, Stanley Mark, Cassidy Jeffrey
Department of Orthopaedics, Naval Medical Center San Diego, 34800 Bob Wilson Dr. Suite 112, San Diego, CA 92134-1112, USA.
Skeletal Radiol. 2009 Apr;38(4):371-5. doi: 10.1007/s00256-008-0603-8. Epub 2008 Nov 12.
Studies directly evaluating the reliability of the Risser sign are few in number, possess small sample sizes, and offer conflicting results. This study establishes the reliability of the Risser sign on a large sample size in an effort to provide clarification on the subject.
Two years' worth of AP pelvis radiographs from patients aged 8-20 years were downloaded from our institution's digital imaging system. An independent reviewer selected 100 of these images for inclusion, aiming to capture a spread of radiographs covering all Risser stages. Risser grading occurred in two rounds. In each round, three examiners reviewed the 100 radiographs in random order on three different occasions. The full AP pelvis radiograph was graded in Round 1, while only the iliac apophysis was visible in Round 2. Kappa coefficients and their confidence bounds are reported to indicate intra- and inter-observer reliability. McNemar's test was used to compare the rates of agreement on Risser stage between Rounds 1 and 2. The signed-rank test was used to evaluate differences in intra-observer values between rounds.
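As an illustration of the kappa statistic named above, the following is a minimal sketch of Cohen's kappa for two sets of gradings of the same radiographs. The abstract does not state which kappa variant was used for three examiners, so the two-rater form here is an assumption, and the function name `cohen_kappa` and the example gradings are hypothetical, not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings.

    Assumption: two raters (or two readings by one rater) grading the
    same items; this is not necessarily the variant used in the study.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of items given the same grade.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical Risser grades (0-5) for eight radiographs, for illustration only.
reading_1 = [0, 1, 2, 3, 4, 5, 2, 3]
reading_2 = [0, 1, 2, 3, 4, 5, 3, 3]
kappa = cohen_kappa(reading_1, reading_2)
```

Applied to two readings by the same examiner, the function gives an intra-observer value; applied to readings by two different examiners, it gives a pairwise inter-observer value.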
Round 1 inter-observer kappa was 0.76. Round 2 inter-observer kappa was 0.51. In Round 1, 63 radiographs showed perfect agreement within the same Risser stage for all observations, compared with 44 radiographs in Round 2 (p = 0.004). Round 1 intra-observer kappa values were 0.92, 0.86, and 0.88. Round 2 intra-observer kappa values were 0.91, 0.77, and 0.88. Intra-observer value differences between rounds were not significant for two observers (p = 0.074, 0.061) but were significant for the third observer (p = 0.002).
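The comparison of perfect-agreement counts between rounds (63 versus 44) is a paired comparison on the same 100 radiographs, which is the setting for McNemar's test. The sketch below shows an exact two-sided version of that test. It depends only on the discordant pairs, which the abstract does not report, so the counts used here are hypothetical and will not reproduce the published p-value. Whether the authors used the exact or the chi-square form is also not stated.

```python
from math import comb


def mcnemar_exact(b, c):
    """Exact two-sided McNemar test p-value.

    b: items with perfect agreement in Round 1 only
    c: items with perfect agreement in Round 2 only
    Under the null hypothesis, each discordant item is equally likely
    to fall in either group, so b ~ Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    # One-sided binomial tail probability, doubled and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts chosen only so that b - c = 63 - 44 = 19.
p_value = mcnemar_exact(25, 6)
```

Radiographs with perfect agreement in both rounds, or in neither, do not enter the calculation; only those whose agreement status changed between rounds contribute.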
The reliability of the Risser sign is acceptable and can be further improved when other markers of skeletal maturity on the pelvis radiograph are used to assist in grading.