Andreatta Pamela B, Woodrum Derek T, Gauger Paul G, Minter Rebecca M
Department of Medical Education and Surgery, University of Michigan, Ann Arbor, Michigan 48109-5329, USA.
Simul Healthc. 2008 Spring;3(1):16-25. doi: 10.1097/SIH.0b013e31816366b9.
Many surgical training programs are introducing virtual-reality laparoscopic simulators into their curricula. If a surgical simulator is to be used to determine when a trainee has reached an "expert" level of performance, its evaluation metrics must accurately reflect varying levels of skill. The ability of a metric to differentiate novice from expert performance is referred to as construct validity. The present study was undertaken to determine whether the LapMentor's metrics demonstrate construct validity.
Medical students, residents, and faculty laparoscopic surgeons (n = 5-14 per group) performed 5 consecutive repetitions of 6 laparoscopic skills tasks: 30-degree Camera Manipulation, Eye-Hand Coordination, Clipping/Grasping, Cutting, Electrocautery, and Translocation of Objects. The LapMentor measured performance on 4 to 12 parameters per task. Mean performance on each parameter was compared among the subject groups for the first and fifth repetitions. Pairwise comparisons among the 3 groups were made by post hoc t-tests with the Bonferroni correction. Significance was set at P < 0.05.
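As a concrete illustration of the analysis described above, the following Python sketch performs pairwise post hoc t-tests among three groups with a Bonferroni correction; the group names and the per-subject scores are hypothetical placeholders, not the study's data, and the study's actual parameter definitions are not assumed.

    # Minimal sketch: pairwise post hoc t-tests with a Bonferroni correction.
    from itertools import combinations
    from scipy import stats

    # Hypothetical scores for one LapMentor parameter (e.g., Time), one list per group.
    groups = {
        "students":  [98.0, 105.2, 110.4, 95.1, 101.7],
        "residents": [82.3, 79.9, 88.6, 84.0, 90.2],
        "faculty":   [61.5, 58.8, 65.2, 63.0, 60.1],
    }

    alpha = 0.05
    pairs = list(combinations(groups, 2))
    # Bonferroni: divide the significance threshold by the number of comparisons.
    corrected_alpha = alpha / len(pairs)

    for a, b in pairs:
        t, p = stats.ttest_ind(groups[a], groups[b])
        verdict = "significant" if p < corrected_alpha else "not significant"
        print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f} "
              f"({verdict} at corrected alpha = {corrected_alpha:.4f})")

With 3 groups there are 3 pairwise comparisons, so the per-comparison threshold becomes 0.05/3, approximately 0.0167, which is what the Bonferroni technique means in this context.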
Of the 6 tasks evaluated, only the Eye-Hand Coordination task (3 of 12 parameters) and the Clipping/Grasping task (1 of 7 parameters) showed expert-level discrimination when performance was compared after completion of 1 repetition. Comparison of fifth-repetition performance (representing the plateau of the learning curves) demonstrated that the parameters Time and Score had expert-level discrimination on the Eye-Hand Coordination task, and Time on the Cutting task. The remaining LapMentor tasks did not differentiate level of expertise based on the built-in metrics on either repetition 1 or 5.
The majority of the LapMentor tasks' metrics were unable to differentiate between laparoscopic experts and less-skilled subjects. Therefore, performance on those tasks may not accurately reflect a subject's true level of ability. Feedback to the manufacturer about these findings may encourage the development of evaluation parameters with greater sensitivity.