Department of Surgery, Stanford University School of Medicine, Stanford, California.
J Surg Educ. 2022 Jan-Feb;79(1):206-215. doi: 10.1016/j.jsurg.2021.07.002. Epub 2021 Aug 3.
The gold standard for evaluating resident procedural competence is validated assessment by faculty surgeons. Providing adequate trainee assessments is challenged by a shortage of faculty due to increased clinical and administrative responsibilities. We hypothesized that, with a well-constructed assessment instrument and training, there would be minimal differences in procedural assessments made by near-peer resident raters (RR), faculty raters (FR), and trained raters (TR).
Deidentified videos of residents performing hand-sewn (HA) and stapled (SA) anastomoses were distributed to blinded reviewers of 3 types. The intraclass correlation coefficient (ICC) of RR, FR, and TR assessments was determined for each procedure. A fully crossed design was used to examine internal structure validity in a generalizability study. A decision study (D-study) was performed to project the number of raters needed for a g-coefficient > 0.70.
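The D-study projection described above can be illustrated with a minimal sketch. It assumes a single-facet, fully crossed persons-by-raters design and the standard relative g-coefficient, sigma2_p / (sigma2_p + sigma2_residual / n_raters). The function names and the variance components in the usage example are hypothetical and are not taken from the study, whose design may include additional facets, so this is not expected to reproduce the reported rater counts.

```python
def g_coefficient(var_person, var_residual, n_raters):
    """Relative g-coefficient for a persons x raters crossed design.

    var_person:   variance attributable to the people being rated
    var_residual: rater-by-person interaction plus error variance
    n_raters:     number of raters whose scores are averaged
    """
    return var_person / (var_person + var_residual / n_raters)


def raters_needed(var_person, var_residual, target=0.70, max_raters=100):
    """Smallest number of raters whose projected g-coefficient exceeds target.

    Returns None if the target is not reached within max_raters.
    """
    for n in range(1, max_raters + 1):
        if g_coefficient(var_person, var_residual, n) > target:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical variance components, chosen only for illustration.
    print(raters_needed(var_person=1.0, var_residual=1.0))
    print(raters_needed(var_person=1.0, var_residual=4.0))
```

In this sketch, a larger residual variance relative to person variance pushes the required number of raters up, which is the general pattern a D-study is meant to expose.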
This study was conducted within a private academic institution, using the creation of intestinal anastomoses as the procedural model.
Raters comprised residents who were untrained in the assessment tool (UTA), UTA faculty surgeons, and individuals trained in the tool.
Twenty-nine videos (15 HA and 14 SA) were reviewed by a total of 9 video reviewers (4 RR, 2 FR, and 3 TR). HA ICC values were 0.84 (confidence interval [CI]: 0.81-0.87) for RR, 0.89 (CI: 0.86-0.92) for FR, and 0.88 (CI: 0.86-0.90) for TR. SA ICC values were 0.77 (CI: 0.72-0.80) for RR, 0.79 (CI: 0.75-0.83) for FR, and 0.86 (CI: 0.83-0.88) for TR. The g-coefficients were RR = 0.72, FR = 0.85, and TR = 0.77 for HA; and RR = 0.33, FR = 0.38, and TR = 0.4 for SA. The D-study indicated that at least 2 raters of any type were needed for HA and > 11 FR for SA.
Faculty without training show high assessment agreement. Using peers for surgical skills assessment without training is an option for formative evaluation. Raters should be trained in the assessment tool for any assessment, formative or summative, for the optimal evaluation of procedural competence.