Conkling Nicole, Bishawi Muath, Phillips Brett T, Bui Duc T, Khan Sami U, Dagum Alexander B
Division of Plastic Surgery, Stony Brook University Medical Center, Stony Brook, NY, USA.
Ann Plast Surg. 2012 Oct;69(4):350-5. doi: 10.1097/SAP.0b013e31824a43e0.
Throughout the literature, investigators have assessed the cosmetic efficacy of botulinum toxin (BT) treatment by using various subjective, qualitative measures, including the Facial Wrinkle Scale (FWS) and Subject Global Assessment (SGA). The widely used FWS and SGA attempt to quantify both the magnitude and duration of cosmetic outcomes as assessed by physician and patient. We sought to determine the interobserver validity of these scales relative to the level of observer experience.
Botulinum toxin injections were administered for cosmetic effect in 6 patients recruited as part of an institutional review board-approved investigation. Subjects were photographed at rest and during animation (raising eyebrows, frowning, and blinking) before treatment, at 1, 2, and 4 weeks after treatment, and then monthly through 6 months of follow-up. Standardized digital 8″×10″ prints were scored using the FWS by board-certified plastic surgeons (n=5), general surgery residents (n=3), and medical students (n=4). Photographs at each time point were then compared to baseline using the SGA. Statistical analysis of observer data was performed using SPSS v19. Cohen κ (FWS) and Spearman ρ (SGA) were calculated for each pairwise comparison of observer data, with a conservative α of 0.01.
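The pairwise agreement statistics named above can be illustrated with a minimal sketch. The study itself used SPSS v19; the Python below is only an approximation of the same two statistics, and the function names, the observer labels, and the scores are all invented for illustration (the 0 to 3 coding of the FWS is also an assumption here). It computes unweighted Cohen κ and tie-corrected Spearman ρ for every pair of observers; significance testing against α = 0.01 is not shown.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cohen_kappa(a, b):
    """Unweighted Cohen kappa for two raters scoring the same items.

    Undefined (division by zero) if both raters use one identical category.
    """
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the two raters' marginal proportions.
    p_exp = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(a, b):
    """Spearman rho as the Pearson correlation of average ranks.

    Undefined (division by zero) if either rater gives a constant score.
    """
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / sqrt(var_a * var_b)


# Hypothetical scores for illustration only; not data from the study.
scores = {
    "surgeon_1": [0, 1, 2, 3, 2, 1, 0, 3],
    "surgeon_2": [0, 1, 2, 3, 1, 1, 0, 2],
    "student_1": [1, 1, 3, 3, 2, 0, 0, 2],
}

for r1, r2 in combinations(scores, 2):
    kappa = cohen_kappa(scores[r1], scores[r2])
    rho = spearman_rho(scores[r1], scores[r2])
    print(f"{r1} vs {r2}: kappa={kappa:.3f}, rho={rho:.3f}")
```

With 12 observers (5 + 3 + 4), comparing every pair gives 66 comparisons per scale, of which 19 are within a single experience group (10 among surgeons, 3 among residents, 6 among students).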
The FWS observer scores for the upper face overall were generally in agreement, with no negative κ values. The distribution, even among members of a single group, was highly variable. Agreement among plastic surgeons was the greatest (κ, 0.194-0.609). Resident concordance was moderate, and medical students displayed the most variable agreement. Spearman ρ for SGA scores was much higher, with surgeons approaching excellent agreement (ρ, 0.443-0.992). In comparisons between members of different groups, agreement was unpredictable for both the FWS and SGA. Comparisons using scores from individual areas of the face were least concordant.
The FWS and SGA represent the current standard of cosmetic outcomes measures; however, when subjected to scrutiny they display relatively unpredictable agreement even among plastic surgeons. Compared to the FWS, the SGA has a more acceptable user concordance, especially among plastic surgeons accustomed to using such scales. The interobserver variability of FWS and SGA scoring underlines the need to explore objective, quantitative cosmetic outcomes measures.