Borderline grades in high stakes clinical examinations: resolving examiner uncertainty.

Affiliation

Faculty of Medicine, University of New South Wales, Sydney, Australia.

Publication information

BMC Med Educ. 2018 Nov 20;18(1):272. doi: 10.1186/s12909-018-1382-0.

Abstract

BACKGROUND

Objective Structured Clinical Examinations (OSCEs) are used to increase reliability and validity, yet they achieve only a modest level of reliability. This low reliability is due in part to examiner variance, which is greater than the variance among students. Examiner variance often reflects indecisiveness at the cut score, with apparent confusion over terms such as "borderline pass". It is amplified by a well-reported "failure to fail".

METHODS

A borderline grade (meaning performance is neither a clear pass nor a clear fail) was introduced in a high-stakes undergraduate medical clinical skills exam to replace a borderline pass grade (which was historically resolved as 50%) in a 4-point scale (distinction, pass, borderline, fail). Each Borderline (B) grade was then resolved into a Pass (P) or Fail (F) grade by a formula referencing the difficulty of the station and the student's performance in the same domain in other stations. Raw pass or fail grades were unaltered. Mean scores and 95% CIs were calculated per station and per domain for the unmodified and the modified scores/grades (results are presented as error bars). To estimate the defensibility of these modifications, a similar analysis was carried out for the P and F grades that resulted from the modification of the B grades.
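The per-station mean and 95% CI described above can be sketched as follows. This is a minimal illustration only: the abstract does not say how the intervals were computed, so the normal approximation (mean ± 1.96 × standard error) is an assumption, and the function name `mean_ci95`, the station labels, and the scores are hypothetical values made up for the example.

```python
import math
import statistics


def mean_ci95(scores):
    """Return (mean, lower, upper) for a list of numeric scores.

    Assumption: a normal-approximation 95% CI, mean +/- 1.96 * SEM.
    The source abstract does not state which CI method was used.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate a CI")
    mean = statistics.fmean(scores)
    sem = statistics.stdev(scores) / math.sqrt(n)  # standard error of the mean
    half_width = 1.96 * sem
    return mean, mean - half_width, mean + half_width


# Hypothetical scores per station (illustrative values, not study data).
station_scores = {
    "station_A": [6.0, 7.0, 8.0, 7.0, 7.0],
    "station_B": [5.0, 6.0, 6.0, 7.0, 5.0],
}

for station, scores in station_scores.items():
    mean, lower, upper = mean_ci95(scores)
    print(f"{station}: mean={mean:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

Running the same function once on the unmodified scores and once on the modified scores, per station and per domain, would give the pairs of intervals that an error-bar plot compares.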

RESULTS

Of 14,634 observations, 4.69% were Borderline. Application of the formula did not affect the mean scores in each domain, but the failure rate for the exam increased from 0.7% to 4.1%. Examiners and students expressed satisfaction with the Borderline grade, resolution formula and outcomes. Mean scores (by stations and by domains respectively) of students whose B grades were modified to P were significantly higher than those of their counterparts whose B grades were modified to F.
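As a quick check of scale, the reported percentages can be turned into approximate counts. The sketch below only does arithmetic on the figures quoted above. The abstract gives no absolute number of Borderline grades or of students, so the count is recovered as the set of integers consistent with a rounded 4.69%, and the variable names are illustrative.

```python
total_observations = 14634
reported_borderline_pct = 4.69  # as reported, rounded to two decimals

# Integer counts whose percentage of the total rounds to the reported value.
consistent_counts = [
    k
    for k in range(total_observations + 1)
    if round(100 * k / total_observations, 2) == reported_borderline_pct
]
print(consistent_counts)

# Change in the exam failure rate, in percentage points.
failure_before_pct = 0.7
failure_after_pct = 4.1
change_points = round(failure_after_pct - failure_before_pct, 1)
print(change_points)
```

So the Borderline grades amount to roughly 686 or 687 of the 14,634 observations, and the failure rate rose by 3.4 percentage points.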

CONCLUSIONS

This study provides a feasible and defensible resolution for situations where the examinee's performance is neither a clear pass nor a clear fail, demonstrating the application of the borderline resolution formula in a high-stakes exam. It does not create a new performance standard but uses real data to make judgements about this small number of candidates. This is perceived as a fair approach to Pass/Fail decisions.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0759/6247637/d6708bdc6437/12909_2018_1382_Fig1_HTML.jpg
