Power Stephanie M, Matic Damir B
Cleft Palate Craniofac J. 2013 Mar;50(2):144-9. doi: 10.1597/11-104. Epub 2012 Mar 19.
Objective: Cleft surgeons often show 10 consecutive lip repairs to reduce presentation bias; however, the validity of this practice remains unknown. The purpose of this study is to determine the number of consecutive cases needed to represent average outcomes. Secondary objectives are to determine whether outcomes correlate with cleft severity and to calculate interrater reliability.
Design: Consecutive preoperative and 2-year postoperative photographs of the unilateral cleft lip-nose complex were randomized and evaluated by cleft surgeons. Parametric analysis was performed according to chronologic, consecutive order. The mean standard deviation over all raters enabled calculation of expected 95% confidence intervals around a mean for various sample sizes.
Setting: Meeting of the American Cleft Palate-Craniofacial Association in 2009.
Patients, Participants: Ten senior cleft surgeons evaluated 39 consecutive lip repairs.
Main Outcome Measures: Preoperative severity and postoperative outcomes were evaluated using descriptive and quantitative scales.
Results: Intraclass correlation coefficients for cleft severity and postoperative evaluations were 0.65 and 0.21, respectively. Outcomes did not correlate with cleft severity (P = .28). Calculations for 10 consecutive cases demonstrated wide 95% confidence intervals, spanning two points on both postoperative grading scales. For 27 consecutive cases, 95% confidence intervals narrowed to within one qualitative grade (±0.30) and to within one point on the 10-point scale (±0.50).
Conclusions: Larger numbers of consecutive cases (n > 27) are increasingly representative of average results but are less practical in presentation format. Ten consecutive cases lack statistical support. Cleft surgeons showed low interrater reliability for postoperative assessments, which may reflect personal bias when evaluating another surgeon's results.
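The confidence interval calculation described under Design can be illustrated with a minimal sketch. It assumes a normal approximation, where the half-width of the interval is z × SD / √n. The abstract does not report the mean standard deviation or say whether a z or t critical value was used, so the value `ASSUMED_SD = 1.3` and the helper names `ci_half_width` and `min_cases` are hypothetical, chosen only to show how the interval narrows as consecutive cases are added.

```python
import math
from statistics import NormalDist

# Hypothetical pooled rater standard deviation on a 10-point scale.
# This is NOT a value reported in the abstract; it is an illustration only.
ASSUMED_SD = 1.3


def ci_half_width(sd: float, n: int, level: float = 0.95) -> float:
    """Half-width of a confidence interval around a mean of n cases.

    Uses the normal approximation (z * sd / sqrt(n)), which is an
    assumption; the original analysis may have used a t distribution.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sd / math.sqrt(n)


def min_cases(sd: float, target: float, level: float = 0.95, max_n: int = 1000):
    """Smallest n whose interval half-width is at most `target`, or None."""
    for n in range(2, max_n + 1):
        if ci_half_width(sd, n, level) <= target:
            return n
    return None


if __name__ == "__main__":
    # Show how the interval tightens with more consecutive cases.
    for n in (10, 20, 27, 39):
        print(f"n={n:2d}  95% CI half-width = ±{ci_half_width(ASSUMED_SD, n):.2f}")
    print("cases needed for ±0.50:", min_cases(ASSUMED_SD, 0.50))
```

Because the half-width shrinks with √n rather than n, cutting it roughly in half requires about four times as many cases. This is why a series of 10 cases gives a much wider interval than one of 27 or more.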