Ariadne Labs: a joint center for health system innovation at the Brigham and Women's Hospital and Harvard School of Public Health, Boston, Massachusetts, USA; Department of Surgery, Stanford University School of Medicine, Stanford, California, USA.
Department of Health Policy and Management, Harvard School of Public Health, Boston, Massachusetts, USA; Tanana Valley Clinic, Fairbanks, Alaska, USA.
BMJ Qual Saf. 2014 Aug;23(8):639-50. doi: 10.1136/bmjqs-2013-002446. Epub 2014 Feb 4.
To assess the inter-rater reliability (IRR) of two novel observation tools for measuring surgical safety checklist performance and teamwork.
Surgical safety checklists can promote adherence to standards of care and improve teamwork in the operating room. Their use has been associated with reductions in mortality and other postoperative complications. However, the effectiveness of a checklist depends on how well it is performed.
Authors from the Safe Surgery 2015 initiative developed a pair of novel observation tools through literature review, expert consultation and end-user testing. In one South Carolina hospital participating in the initiative, two observers jointly attended 50 surgical cases and independently rated surgical teams using both tools. We used descriptive statistics to measure checklist performance and teamwork at the hospital. We assessed IRR by measuring percent agreement, Cohen's κ, and weighted κ scores.
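The three agreement measures named above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names and the example ratings are hypothetical, and the linear weighting used for weighted κ is an assumption, since the abstract does not state which weighting scheme was applied.

```python
def percent_agreement(rater_a, rater_b):
    """Fraction of cases on which the two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b, categories, weighted=False):
    """Cohen's kappa for two raters.

    `categories` is the ordered list of possible ratings.
    With weighted=True, linear disagreement weights |i - j| / (k - 1)
    are used (an assumption; quadratic weights are another common choice).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    n = len(rater_a)
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions: obs[i][j] is the share of cases rated
    # category i by rater A and category j by rater B.
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(rater_a, rater_b):
        obs[index[x]][index[y]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, larger further away.
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    observed = sum(weight(i, j) * obs[i][j]
                   for i in range(k) for j in range(k))
    expected = sum(weight(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k))
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    # Kappa is 1 minus the ratio of observed to chance-expected disagreement.
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Hypothetical ratings on a 3-point scale, not data from the study.
    a = [1, 2, 3, 1]
    b = [1, 2, 3, 3]
    print(percent_agreement(a, b))
    print(cohen_kappa(a, b, categories=[1, 2, 3]))
    print(cohen_kappa(a, b, categories=[1, 2, 3], weighted=True))
```

Percent agreement ignores agreement that would occur by chance, which is why κ is reported alongside it. When almost all ratings fall in one category, as with the high teamwork scores with limited variation reported in the results, κ can be low even though percent agreement is high.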
The overall percent agreement and κ between the two observers were 93% and 0.74 (95% CI 0.66 to 0.79), respectively, for the Checklist Coaching Tool, and 86% and 0.84 (95% CI 0.77 to 0.90) for the Surgical Teamwork Tool. Percent agreement for individual sections of both tools was 79% or higher. Additionally, κ scores for six of eight sections on the Checklist Coaching Tool and for two of five domains on the Surgical Teamwork Tool achieved the desired 0.7 threshold. However, teamwork scores were high and variation was limited. There were no significant changes in the percent agreement or κ scores between the first 10 and last 10 cases observed.
Both tools demonstrated substantial IRR and required limited training to use. These instruments may be used to observe checklist performance and teamwork in the operating room. However, further refinement and calibration of observer expectations, particularly in rating teamwork, could improve the utility of the tools.