Vernez Simone L, Huynh Victor, Osann Kathryn, Okhunov Zhamshid, Landman Jaime, Clayman Ralph V
1 Department of Urology, University of California, Irvine, Orange, California.
2 Hematology-Oncology Division, Department of Medicine, University of California, Irvine, Orange, California.
J Endourol. 2017 Apr;31(S1):S95-S100. doi: 10.1089/end.2016.0569. Epub 2016 Oct 11.
We hypothesized that surgical skills assessment could aid in the selection process of medical student applicants to a surgical program. Recently, crowdsourcing has been shown to provide an accurate assessment of surgical skills at all levels of training. We compared expert and crowd assessment of surgical tasks performed by resident applicants during their interview day at the urology program at the University of California, Irvine.
Twenty-five resident interviewees performed four tasks: open square knot tying, laparoscopic peg transfer, robotic suturing, and skill task 8 on the LAP Mentor™ (Simbionix Ltd., Lod, Israel). Faculty experts and crowd workers (Crowd-Sourced Assessment of Technical Skills [C-SATS], Seattle, WA) assessed the recorded performances using three validated tools: the Objective Structured Assessment of Technical Skills (OSATS), the Global Evaluative Assessment of Robotic Skills (GEARS), and the Global Operative Assessment of Laparoscopic Skills (GOALS).
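To illustrate how ratings from many crowd workers might be aggregated into a single per-applicant score, the following is a minimal Python sketch. The six domains and the 1-5 Likert scale follow the published GEARS instrument; the rating data, applicant identifiers, and function names are invented for illustration and are not taken from the study.

from statistics import mean

# The six GEARS domains, each rated on a 1-5 Likert scale.
GEARS_DOMAINS = [
    "depth_perception", "bimanual_dexterity", "efficiency",
    "force_sensitivity", "autonomy", "robotic_control",
]

# Hypothetical raw data: applicant -> one dict of domain scores per rater.
ratings = {
    "applicant_01": [
        {"depth_perception": 4, "bimanual_dexterity": 3, "efficiency": 4,
         "force_sensitivity": 3, "autonomy": 4, "robotic_control": 3},
        {"depth_perception": 5, "bimanual_dexterity": 4, "efficiency": 4,
         "force_sensitivity": 4, "autonomy": 3, "robotic_control": 4},
    ],
}

def composite_score(rater_scores):
    """Sum the six domain scores for one rater (range 6-30)."""
    return sum(rater_scores[d] for d in GEARS_DOMAINS)

def applicant_score(all_raters):
    """Average the composite scores across all raters for one applicant."""
    return mean(composite_score(r) for r in all_raters)

for applicant, raters in ratings.items():
    print(applicant, round(applicant_score(raters), 2))

Averaging composites across raters is one plausible design choice here; it is also what makes a crowd of non-expert raters usable, since individual rating noise shrinks as the number of raters grows.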
Overall, 3938 crowd assessments were obtained for the four tasks in less than 3.5 hours, whereas the average time to receive 150 expert assessments was 22 days. Inter-rater agreement between expert and crowd scores was 0.62 for open knot tying, 0.92 for laparoscopic peg transfer, and 0.86 for robotic suturing. Agreement between the applicant ranking generated by the LAP Mentor's built-in metrics for skill task 8 and the crowd assessment was 0.32. The crowd rank list based solely on skills performance showed only modest agreement with the final faculty match rank list (0.46); however, none of the bottom five crowd-rated applicants appeared among the top five expert-rated applicants, and none of the top five crowd-rated applicants appeared among the bottom five expert-rated applicants.
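The agreement figures above can be read as correlations between expert and crowd ratings of the same applicants. The abstract does not name the statistic used, so the sketch below assumes Spearman rank correlation on invented per-applicant mean scores, and it also reproduces the top-five/bottom-five overlap check described above; all numbers are fabricated for illustration.

from scipy.stats import spearmanr

# Hypothetical mean scores for 10 applicants (the study had 25).
expert = [22.1, 18.4, 25.0, 20.2, 16.9, 23.5, 19.8, 21.0, 17.5, 24.2]
crowd  = [21.5, 19.0, 24.1, 19.7, 17.2, 24.0, 20.5, 20.1, 18.0, 23.3]

rho, p = spearmanr(expert, crowd)
print(f"rank agreement (Spearman rho): {rho:.2f}, p = {p:.3f}")

def rank_order(scores):
    """Applicant indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

expert_rank, crowd_rank = rank_order(expert), rank_order(crowd)
k = 5
# Check whether any bottom-k crowd applicant appears in the expert top k,
# and vice versa; empty sets correspond to the finding reported above.
print("bottom-5 crowd in top-5 expert:",
      set(expert_rank[:k]) & set(crowd_rank[-k:]))
print("top-5 crowd in bottom-5 expert:",
      set(expert_rank[-k:]) & set(crowd_rank[:k]))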
Crowdsourced assessment of resident applicants' surgical skills shows good inter-rater agreement with expert physician raters, but not with a computer-based objective motion-metrics software assessment. Overall applicant rank was affected to some degree by the crowd performance rating.