Department of Urologic Sciences, University of British Columbia, Vancouver, Canada.
J Urol. 2012 Oct;188(4 Suppl):1490-2. doi: 10.1016/j.juro.2012.02.015. Epub 2012 Aug 18.
In 1985 the International Reflux Committee proposed a grading system for vesicoureteral reflux that has since been used extensively in everyday practice and research studies. Despite this widespread use, which rests mainly on face validity, the interrater and intrarater reliability of the tool is not known. A tool cannot be considered valid unless it is reliable. Therefore, we estimated the interrater and intrarater reliability of the international grading system for vesicoureteral reflux.
A series of 28 voiding cystourethrogram studies was selected. The images were assembled in random order in an electronic presentation. Four pediatric radiologists, 5 pediatric urologists and 4 senior urology residents graded the studies. The images were then reshuffled randomly and re-rated after 7 days, for a total of 728 observations (28 studies × 13 raters × 2 sessions). Cohen weighted kappa statistics were used to determine interrater and intrarater reliability. Subgroup analysis was then performed comparing the variability among the 3 groups of raters and among the different grades.
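To illustrate the statistic named above, the following is a minimal sketch of Cohen's weighted kappa for two sets of ordinal ratings. Several points are assumptions rather than details from the study: the abstract does not say whether linear or quadratic weights were used (linear is the default here), does not say how kappas across raters were combined (a simple mean over all rater pairs is shown), and the function names `weighted_kappa` and `mean_pairwise_kappa` and the integer coding of grades I to V as 1 to 5 are hypothetical choices made for illustration.

```python
from itertools import combinations


def weighted_kappa(a, b, categories, weight="linear"):
    """Cohen's weighted kappa for two equal-length lists of ordinal ratings.

    `categories` lists the ordered grades; `weight` is "linear" or
    "quadratic" (the weighting scheme is an assumption, not from the study).
    """
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Observed joint proportions: obs[i][j] = share of studies graded i by
    # the first rating and j by the second.
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[index[x]][index[y]] += 1.0 / n

    # Marginal proportions for each rating.
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weight == "linear" else d * d

    observed = sum(disagreement(i, j) * obs[i][j]
                   for i in range(k) for j in range(k))
    expected = sum(disagreement(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k))
    if expected == 0.0:
        # Both ratings used a single identical grade: kappa is undefined.
        return float("nan")
    return 1.0 - observed / expected


def mean_pairwise_kappa(ratings_by_rater, categories, weight="linear"):
    """Average weighted kappa over all pairs of raters (an assumed scheme)."""
    pairs = list(combinations(ratings_by_rater, 2))
    return sum(weighted_kappa(a, b, categories, weight)
               for a, b in pairs) / len(pairs)


# Made-up ratings for illustration only; these are not data from the study.
GRADES = [1, 2, 3, 4, 5]          # grades I to V coded as integers
session_1 = [1, 2, 3, 4, 5]
session_2 = [1, 2, 3, 4, 5]       # identical re-rating
reversed_ratings = [5, 4, 3, 2, 1]

perfect = weighted_kappa(session_1, session_2, GRADES)
opposite = weighted_kappa(session_1, reversed_ratings, GRADES)
total_observations = 28 * 13 * 2  # studies x raters x sessions
```

In this sketch, interrater reliability would compare different raters within one session, while intrarater reliability would compare each rater's first and second session.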
The average interrater reliability was 0.53 (95% CI 0.52-0.55, p < 0.0001). Agreement within subgroups was 0.61 for urologists, 0.59 for residents and 0.56 for radiologists. Agreement was lowest for grade III (0.36) and highest for grade I (0.98). The intrarater reliability was 0.86 (95% CI 0.77-0.95, p < 0.001).
The international grading system for vesicoureteral reflux shows low interrater reliability for moderate degrees of vesicoureteral reflux whereas the intrarater reliability is high. Modification of this system may improve its reproducibility.