Use of Generalizability Theory for Exploring Reliability of and Sources of Variance in Assessment of Technical Skills: A Systematic Review and Meta-Analysis.

Author Information

S.A.W. Andersen is postdoctoral researcher, Copenhagen Academy for Medical Education and Simulation (CAMES), Center for Human Resources and Education, Capital Region of Denmark, and Department of Otolaryngology, The Ohio State University, Columbus, Ohio, and resident in otorhinolaryngology, Department of Otorhinolaryngology-Head & Neck Surgery, Rigshospitalet, Copenhagen, Denmark; ORCID: https://orcid.org/0000-0002-3491-9790.

L.J. Nayahangan is researcher, CAMES, Center for Human Resources and Education, Capital Region of Denmark, Copenhagen, Denmark; ORCID: https://orcid.org/0000-0002-6179-1622.

Publication Information

Acad Med. 2021 Nov 1;96(11):1609-1619. doi: 10.1097/ACM.0000000000004150.

Abstract

PURPOSE

Competency-based education relies on the validity and reliability of assessment scores. Generalizability (G) theory is well suited to explore the reliability of assessment tools in medical education but has only been applied to a limited extent. This study aimed to systematically review the literature using G-theory to explore the reliability of structured assessment of medical and surgical technical skills and to assess the relative contributions of different factors to variance.

METHOD

In June 2020, 11 databases, including PubMed, were searched from inception through May 31, 2020. Eligible studies included the use of G-theory to explore reliability in the context of assessment of medical and surgical technical skills. Descriptive information on study, assessment context, assessment protocol, participants being assessed, and G-analyses was extracted. Data were used to map G-theory and explore variance components analyses. A meta-analysis was conducted to synthesize the extracted data on the sources of variance and reliability.

RESULTS

Forty-four studies were included; of these, 39 had sufficient data for meta-analysis. The total pool included 35,284 unique assessments of 31,496 unique performances by 4,154 participants. Person variance had a pooled effect of 44.2% (95% confidence interval [CI], 36.8%-51.5%). Only assessment tool type (Objective Structured Assessment of Technical Skills-type vs task-based checklist-type) had a significant effect on person variance. The pooled reliability (G-coefficient) was 0.65 (95% CI, 0.59-0.70). Most studies included decision studies (39, 88.6%), which generally indicated that higher ratios of performances to assessors were needed to achieve a sufficiently reliable assessment.
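The G-coefficient and decision-study logic reported above can be illustrated with a minimal sketch. This assumes the simplest fully crossed persons × raters (p × r) design; the variance components below are hypothetical values chosen only to echo the pooled person variance of roughly 44%, and are not taken from any individual study in the review.

```python
# Minimal sketch of a relative G-coefficient and a decision (D) study
# for a fully crossed persons x raters (p x r) design.
# Variance components are illustrative assumptions, not data from the review.

def g_coefficient(var_person: float, var_residual: float, n_raters: int) -> float:
    """Relative G-coefficient for a p x r design:
    E(rho^2) = sigma2_p / (sigma2_p + sigma2_pr,e / n_r),
    where sigma2_pr,e pools the person-rater interaction with error."""
    return var_person / (var_person + var_residual / n_raters)

# Hypothetical components: ~44% of total variance attributable to persons.
var_person = 0.442
var_residual = 0.558

# D-study: project reliability as the number of raters increases.
for n_raters in (1, 2, 4, 8):
    g = g_coefficient(var_person, var_residual, n_raters)
    print(f"raters={n_raters}: G = {g:.2f}")
```

Averaging over more raters shrinks the residual term, so the G-coefficient rises toward 1; a D-study of this kind is how the included studies estimated how many performances and assessors a sufficiently reliable assessment would require.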

CONCLUSIONS

G-theory is increasingly being used to examine reliability of technical skills assessment in medical education, but more rigor in reporting is warranted. Contextual factors can potentially affect variance components and thereby reliability estimates and should be considered, especially in high-stakes assessment. Reliability analysis should be a best practice when developing assessment of technical skills.

