Institute for Simulation and Interprofessional Studies (ISIS), Seattle Children's, University of Washington School of Medicine, Seattle, WA, USA.
Surgery. 2010 May;147(5):622-30. doi: 10.1016/j.surg.2009.10.068. Epub 2009 Dec 16.
BACKGROUND: Validating assessment tools in surgical simulation training is critical to measuring skills objectively. Most reviews do not describe methodologies for conducting rigorous validation studies. Our study reports current methodological approaches and proposes benchmark criteria for establishing validity in surgical simulation studies.
METHODS: We conducted a systematic review of studies establishing validity. A PubMed search was performed with the keywords "validity/validation," "simulation," "surgery," and "technical skills." Two reviewers tabulated descriptors for 29 methodological variables.
RESULTS: A total of 83 studies were included in the review. Of these, 60% examined construct validity, 24% concurrent validity, and 5% predictive validity. Less than half (45%) of all studies reported reliability data. Most studies (82%) were conducted at a single institution, with a mean of 37 subjects recruited. Only half of the studies provided a rationale for task selection. Data sources included simulator-generated measures (34%), performance assessment by human evaluators (33%), motion tracking (6%), and combined modes (28%). In studies using human evaluators, videotaping was a common (48%) blinding technique; however, 34% of studies did not blind evaluators. Commonly reported outcomes included task time (86%), economy of motion (51%), technical errors (48%), and number of movements (25%).
CONCLUSION: The typical validation study comes from a single institution with a small sample size, lacks clear justification for task selection, omits reliability reporting, and introduces potential bias into study design. The lack of standardized validation methodologies creates challenges for training centers that survey the literature to determine the appropriate method for their local settings.