School of Medicine, University of Leeds, Leeds, LS2 9JT, UK.
Adv Health Sci Educ Theory Pract. 2024 Jul;29(3):919-934. doi: 10.1007/s10459-023-10289-w. Epub 2023 Oct 16.
Systematic differences in OSCE scoring across examiners (often termed examiner stringency) can threaten the validity of examination outcomes. Such effects are usually conceptualised and operationalised solely from checklist/domain scores in a station; global grades are rarely used in this type of analysis. In this work, a large candidate-level exam dataset is analysed to develop a more sophisticated understanding of examiner stringency. Station scores are modelled from global grades, with each candidate, station and examiner allowed to vary in ability, difficulty and stringency respectively. In addition, examiners are allowed to vary in how they discriminate across grades; to our knowledge, this is the first time this has been investigated. Results show that examiners contribute strongly to variance in scoring in two distinct ways: via the traditional conception of score stringency (34% of score variance), but also in how they discriminate in scoring across grades (7%). As one might expect, candidate and station account for only a small amount of score variance at the station level once candidate grades are accounted for (3% and 2% respectively), with the remainder being residual (54%). Investigation of impacts on station-level candidate pass/fail decisions suggests that examiner differential stringency effects combine to give false positive (candidates passing in error) and false negative (candidates failing in error) rates of around 5% each in individual stations, but at the exam level these reduce to 0.4% and 3.3% respectively. This work adds to our understanding of examiner behaviour by demonstrating that examiners can vary in qualitatively different ways in their judgments. For institutions, the key message is that it is important to sample widely from the examiner pool, via a sufficient number of stations, to ensure that OSCE-level decisions are defensible. It also suggests that examiner training should include discussion of global grading and of the combined effect of scoring and grading on candidate outcomes.
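To make the modelling idea concrete, below is a minimal simulation sketch (Python/NumPy) of a model of this general form: station scores regressed on global grades, with examiner-specific intercepts (stringency) and examiner-specific grade slopes (discrimination) alongside candidate and station effects. All sample sizes, variance magnitudes and the pass mark are illustrative assumptions, chosen only so the simulated variance components roughly echo the shares reported above; nothing here is taken from the study's data or code.

# Simulation sketch: examiner stringency (score intercept) and
# discrimination (grade slope) as distinct sources of score variance.
# All numbers below are illustrative assumptions, not study values;
# the sds are scaled so the variance components come out roughly
# 34% / 7% / 3% / 2% / 54% of total score variance.
import numpy as np

rng = np.random.default_rng(42)
n_cand, n_stat, n_exam = 1000, 16, 60

ability    = rng.normal(0.0, 1.00, n_cand)   # latent candidate ability
cand_eff   = rng.normal(0.0, 0.17, n_cand)   # candidate score effect (~3%)
stat_eff   = rng.normal(0.0, 0.14, n_stat)   # station difficulty (~2%)
stringency = rng.normal(0.0, 0.58, n_exam)   # examiner intercepts (~34%)
slope      = rng.normal(1.0, 0.26, n_exam)   # examiner grade slopes (~7%)

# Random examiner allocation; global grades (1-5) driven by ability.
exam_id = rng.integers(0, n_exam, size=(n_cand, n_stat))
grade = np.clip(np.round(3.0 + ability[:, None]
                         + rng.normal(0.0, 0.5, (n_cand, n_stat))), 1, 5)
g = grade - 3.0                              # centred grades

eps = rng.normal(0.0, 0.73, (n_cand, n_stat))  # residual (~54%)
score_obs  = (stringency[exam_id] + slope[exam_id] * g
              + cand_eff[:, None] + stat_eff[None, :] + eps)
# Counterfactual: same performance scored by an "average" examiner.
score_fair = g + cand_eff[:, None] + stat_eff[None, :] + eps

cut = 0.0                                    # illustrative pass mark
pass_obs, pass_fair = score_obs >= cut, score_fair >= cut
fp = np.mean(pass_obs & ~pass_fair)          # passed in error, per station
fn = np.mean(~pass_obs & pass_fair)          # failed in error, per station

# Exam level: average station scores before applying the same cut.
exam_obs  = score_obs.mean(axis=1)  >= cut
exam_fair = score_fair.mean(axis=1) >= cut
print(f"station-level FP/FN: {fp:.3f} / {fn:.3f}")
print(f"exam-level    FP/FN: {np.mean(exam_obs & ~exam_fair):.3f} / "
      f"{np.mean(~exam_obs & exam_fair):.3f}")

Running this prints station-level and exam-level false positive/negative rates against the average-examiner counterfactual. The qualitative pattern matches the abstract's message: non-trivial misclassification within individual stations that shrinks substantially once scores are averaged across many stations, and hence across many independently drawn examiners.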