Whiteside James L, Hijaz Adonis, Imrey Peter B, Barber Matthew D, Paraiso Marie F, Rackley Raymond R, Vasavada Sandip P, Walters Mark D, Daneshgari Firouz
Department of Quantitative Health Sciences, Center for Female Pelvic Medicine, Cleveland Clinic Foundation, Ohio, USA.
Obstet Gynecol. 2006 Aug;108(2):315-23. doi: 10.1097/01.AOG.0000227778.77189.2d.
To estimate the reliability and interobserver consistency of urodynamic interpretations of female bladder and urethral function.
Three urogynecologists and three female urologists at a tertiary care medical center reviewed masked, abstracted clinical and urodynamic information from 100 charts, selected for adequate completeness from a consecutive series of 135 women referred for urodynamic testing. For each of the 100 cases, the reviewers assigned International Continence Society filling and voiding phase diagnoses, and overall clinical diagnoses. Raw agreement proportions and weighted kappa chance-corrected agreement statistics (kappa) were used jointly to describe both reliability and interobserver agreement. Reliability was estimated from duplicate reviews, masked and separated by at least 4 months, of each case by each physician. Interobserver agreement was estimated from comparisons of all pairs of responses from different physicians.
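The weighted kappa described above can be illustrated with a minimal sketch. The abstract does not state which weighting scheme was used, so linear disagreement weights are an assumption here, as is the category ordering (absent < indeterminate < present). The function name `weighted_kappa` and the example ratings are hypothetical and are not study data.

```python
def weighted_kappa(ratings_a, ratings_b, categories):
    """Weighted kappa for two sets of ratings of the same cases.

    Assumes linear disagreement weights |i - j| / (k - 1); the abstract
    does not say which weights the authors used.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")

    # Observed joint proportions of (rating_a, rating_b).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each set of ratings.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # 0 on the diagonal (agreement), 1 for the most distant categories.
        return abs(i - j) / (k - 1)

    cells = [(i, j) for i in range(k) for j in range(k)]
    d_observed = sum(weight(i, j) * observed[i][j] for i, j in cells)
    d_expected = sum(weight(i, j) * row[i] * col[j] for i, j in cells)
    if d_expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")

    # Chance-corrected agreement: 1 minus observed/expected weighted disagreement.
    return 1.0 - d_observed / d_expected


# Hypothetical ordering of the three clinical-diagnosis levels (an assumption).
CATEGORIES = ["absent", "indeterminate", "present"]

# Made-up duplicate reviews by one physician (not data from the study).
first_review = ["present", "present", "absent", "indeterminate", "absent", "present"]
second_review = ["present", "indeterminate", "absent", "indeterminate", "absent", "present"]

if __name__ == "__main__":
    print(weighted_kappa(first_review, second_review, CATEGORIES))
```

Applied to one physician's two reviews, this gives a within-physician (reliability) value; applied to reviews from two different physicians, it gives one pairwise interobserver value. How the study pooled values across physicians and pairs is not specified in the abstract.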
For clinical diagnosis of stress incontinence (present, absent, indeterminate), the within- and across-physician weighted kappas were 0.78 and 0.68, respectively. Corresponding results were 0.40 and 0.13 for detrusor overactivity without incontinence, 0.58 and 0.38 for detrusor overactivity with incontinence, and 0.51 and 0.26 for voiding dysfunction. The standard error of each kappa was between 0.023 and 0.043.
In our group, lower urinary tract diagnoses of stress urinary incontinence from both clinical and urodynamic data demonstrated substantial reliability and interobserver agreement. However, by conventional interpretation of kappa statistics, reliability of diagnoses of detrusor overactivity or voiding dysfunction was only moderate, and interobserver agreement on these diagnoses was no better than fair. Urodynamic interpretations may not be satisfactorily reproducible for these diagnoses.
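As an illustration of the "conventional interpretation" mentioned above, the sketch below maps the reported kappas onto the widely used Landis and Koch descriptive bands. That these are the bands the authors had in mind is an assumption; the abstract does not name them. The function name `landis_koch_label` is hypothetical, and the kappa values are those reported in the abstract.

```python
def landis_koch_label(kappa):
    """Descriptive label for a kappa value using the Landis and Koch bands."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# (within-physician, across-physician) weighted kappas reported in the abstract.
REPORTED = {
    "stress incontinence": (0.78, 0.68),
    "detrusor overactivity without incontinence": (0.40, 0.13),
    "detrusor overactivity with incontinence": (0.58, 0.38),
    "voiding dysfunction": (0.51, 0.26),
}

if __name__ == "__main__":
    for diagnosis, (within, across) in REPORTED.items():
        print(diagnosis, landis_koch_label(within), landis_koch_label(across))
```

Under these bands a value of exactly 0.40 sits at the upper edge of "fair", so the label assigned to a boundary value can differ from the grouped summary given in the abstract.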