Assessment Scores of a Mock Objective Structured Clinical Examination Administered to 99 Anesthesiology Residents at 8 Institutions.

Affiliations

From the Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, California.

Department of Medical Education, University of Illinois at Chicago College of Medicine, Chicago, Illinois.

Publication information

Anesth Analg. 2020 Aug;131(2):613-621. doi: 10.1213/ANE.0000000000004705.

BACKGROUND

Objective Structured Clinical Examinations (OSCEs) are used in a variety of high-stakes examinations. The primary goal of this study was to examine factors influencing the variability of assessment scores for mock OSCEs administered to senior anesthesiology residents.

METHODS

Using the American Board of Anesthesiology (ABA) OSCE Content Outline as a blueprint, scenarios were developed for 4 of the ABA skill types: (1) informed consent, (2) treatment options, (3) interpretation of echocardiograms, and (4) application of ultrasonography. Eight residency programs administered these 4 OSCEs to CA3 residents during a 1-day formative session. Faculty raters scored each resident using a global score and checklist items. We used a statistical framework called generalizability theory, or G-theory, to estimate the sources of variation (or facets) and the reliability (ie, reproducibility) of the OSCE performance scores. Reliability is a measure of how consistently or reproducibly the assessment captures learner performance.
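As a rough illustration of what estimating sources of variation means in G-theory, the sketch below works through the simplest case: a fully crossed persons-by-items design with one score per cell, using the standard ANOVA expected-mean-square equations. This is not the design analyzed in the study, which also involves stations and residency programs; the function name and the toy scores are hypothetical.

```python
def variance_components(scores):
    """Estimate variance components for a fully crossed persons x items design.

    scores: list of rows, one row per person, one column per item.
    Returns the person, item, and residual (person-by-item plus error)
    variance components. Illustrative only; not the study's model.
    """
    n_p = len(scores)
    n_i = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_i)
    person_means = [sum(row) / n_i for row in scores]
    item_means = [sum(row[i] for row in scores) / n_p for i in range(n_i)]

    # Sums of squares for each source of variation.
    ss_person = n_i * sum((m - grand) ** 2 for m in person_means)
    ss_item = n_p * sum((m - grand) ** 2 for m in item_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_residual = ss_total - ss_person - ss_item

    # Mean squares.
    ms_person = ss_person / (n_p - 1)
    ms_item = ss_item / (n_i - 1)
    ms_residual = ss_residual / ((n_p - 1) * (n_i - 1))

    # Solve the expected-mean-square equations; truncate negatives at zero.
    return {
        "person": max((ms_person - ms_residual) / n_i, 0.0),
        "item": max((ms_item - ms_residual) / n_p, 0.0),
        "residual": max(ms_residual, 0.0),
    }


if __name__ == "__main__":
    # Hypothetical checklist scores: 3 residents x 2 items.
    toy_scores = [[0, 1], [1, 2], [2, 3]]
    print(variance_components(toy_scores))
```

In a design like the one described here, the same idea extends to more facets (station, program, and their interactions), which is usually handled with mixed-model software rather than hand-computed mean squares.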

RESULTS

Of the 115 eligible senior residents, 99 participated in the OSCE; the remaining 16 were unavailable. Overall, residents correctly performed 84% (standard deviation [SD] 16%, range 38%-100%) of the 36 total checklist items across the 4 OSCEs. On global scoring, the pass rate was 71% for the informed consent station, 97% for treatment options, 66% for interpretation of echocardiograms, and 72% for application of ultrasonography. The reliability estimate expressing the reproducibility of examinee rankings was 0.56 (95% confidence interval [CI], 0.49-0.63), which is reasonable for normative assessments that compare a resident's performance with that of other residents, because over half of the observed variation in total scores is due to variation in examinee ability. The Phi coefficient, which expresses the absolute consistency of a score and reflects how closely the assessment is likely to reproduce an examinee's final score, was 0.42 (95% CI, 0.35-0.50), indicating that criterion-based judgments (eg, pass-fail status) cannot be made. Overall, the largest variance component (14.6%) was due to the person by item by station interaction (3-way interaction), indicating that specific residents did well on some items but poorly on others. The variance due to residency programs across case items (11.2%) was high, suggesting moderate variability in resident performance among residency programs.
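The difference between the ranking-based reliability and the Phi coefficient can be seen in a minimal sketch for a persons-by-items design. The relative (ranking) coefficient counts only the residual as error, while Phi also counts item difficulty, so Phi can never exceed the relative coefficient. The variance components below are hypothetical and are not the study's estimates; the study's design includes further facets.

```python
def relative_g(var_person, var_residual, n_items):
    """Generalizability coefficient for relative (rank-order) decisions."""
    relative_error = var_residual / n_items
    return var_person / (var_person + relative_error)


def phi(var_person, var_item, var_residual, n_items):
    """Phi (dependability) coefficient for absolute, criterion-based decisions."""
    absolute_error = (var_item + var_residual) / n_items
    return var_person / (var_person + absolute_error)


if __name__ == "__main__":
    # Hypothetical variance components, for illustration only.
    var_person, var_item, var_residual, n_items = 1.0, 0.5, 2.0, 4
    print("relative G:", round(relative_g(var_person, var_residual, n_items), 3))
    print("Phi:", round(phi(var_person, var_item, var_residual, n_items), 3))
```

Because both error terms are divided by the number of items, increasing the number of items (or stations, in a fuller model) is the usual way to project how either coefficient would improve.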

CONCLUSIONS

Since many residency programs aim to develop their own mock OSCEs, this study provides evidence that it is possible for programs to create a meaningful mock OSCE experience that is statistically reliable for separating resident performance.
