Hatala Rose, Cook David A, Brydges Ryan, Hawkins Richard
Department of Medicine, University of British Columbia, Suite 5907, Burrard Bldg, St. Paul's Hospital, 1081 Burrard St, Vancouver, BC, V6Z 1Y6, Canada.
Mayo Clinic Online Learning and Mayo Clinic Multidisciplinary Simulation Center, Mayo Clinic College of Medicine, Rochester, MN, USA.
Adv Health Sci Educ Theory Pract. 2015 Dec;20(5):1149-75. doi: 10.1007/s10459-015-9593-1. Epub 2015 Feb 22.
We conducted a systematic review to construct and evaluate the validity argument for the Objective Structured Assessment of Technical Skills (OSATS), based on Kane's framework. We searched MEDLINE, EMBASE, CINAHL, PsycINFO, ERIC, Web of Science, Scopus, and selected reference lists through February 2013. Working in duplicate, we selected original research articles in any language that evaluated the OSATS as an assessment tool for any health professional. We iteratively and collaboratively extracted validity evidence from the included articles to construct and evaluate the validity argument for varied uses of the OSATS. Twenty-nine articles met the inclusion criteria, all focussed on surgical technical skills assessment. We identified three intended uses for the OSATS: formative feedback, high-stakes assessment, and program evaluation. Following Kane's framework, we examined four inferences in the validity argument (scoring, generalization, extrapolation, and decision). For formative feedback and high-stakes assessment, there was reasonable evidence for scoring and extrapolation. For high-stakes assessment, however, there was little evidence for generalization aside from inter-rater reliability data, and no evidence linking multi-station OSATS scores to performance in real clinical settings. For program evaluation, the OSATS validity argument was supported by reasonable generalization and extrapolation evidence. There was a complete lack of evidence regarding implications and decisions based on OSATS scores. In general, validity evidence supported the use of the OSATS for formative feedback. If the OSATS is to be used for higher-stakes decisions and program evaluation, research is required to support decisions based on OSATS scores.