Primary Care Clinical Unit, Faculty of Medicine, The University of Queensland, Brisbane, Queensland, Australia.
Royal Brisbane & Women's Hospitals, Level 8, Health Sciences Building, Herston, QLD, Australia.
BMC Med Educ. 2017 Jun 6;17(1):101. doi: 10.1186/s12909-017-0929-9.
Robust and defensible clinical assessments attempt to minimise differences in student grades that are due to differences in examiner severity (stringency and leniency). Unfortunately, there is little evidence to date that examiner training and feedback interventions are effective; "physician raters" have indeed been deemed "impervious to feedback". Our aim was to investigate the effectiveness of a feedback intervention for general practitioner examiners and to explore examiner attitudes towards it.
Sixteen examiners were provided with a written summary of all examiner ratings awarded in medical student clinical case examinations over the preceding 18 months, enabling them to identify their own rating data and compare it with that of other examiners. Examiner ratings and examiner severity self-estimates were analysed pre- and post-intervention using non-parametric bootstrapping, multivariable linear regression, intra-class correlation, and Spearman's correlation analyses. Examiners also completed a survey exploring their perceptions of the usefulness and acceptability of the intervention, including what (if anything) they planned to do differently as a result of the feedback.
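To make the analyses listed above more concrete, the Python sketch below illustrates one way such an examiner-severity analysis could be set up. It is not the authors' code: the file names, column names (examiner, case, score, phase, self_estimate) are hypothetical, and the residual-based severity index is a simplification of the multivariable regression adjustment for case difficulty described in the abstract.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical ratings table: one row per examiner-case rating, with columns
# 'examiner', 'case', 'score', and 'phase' ('pre' or 'post'). Names are illustrative.
ratings = pd.read_csv("examiner_ratings.csv")

# Crude severity index: each examiner's mean deviation from the case mean.
# This partially adjusts for case difficulty; the study itself used
# multivariable linear regression for that adjustment.
ratings["case_mean"] = ratings.groupby(["phase", "case"])["score"].transform("mean")
ratings["residual"] = ratings["score"] - ratings["case_mean"]
severity = ratings.groupby(["phase", "examiner"])["residual"].mean().unstack("phase")

# Spearman correlation between severity self-estimates (assumed to sit in a
# separate table keyed by examiner) and measured pre-intervention severity.
self_est = pd.read_csv("self_estimates.csv", index_col="examiner")["self_estimate"]
rho, p = stats.spearmanr(self_est.reindex(severity.index), severity["pre"])
print(f"Spearman rho (self-estimate vs measured severity, pre): {rho:.2f} (p={p:.3f})")

# Non-parametric bootstrap CI for the mean pre-to-post change in severity.
change = (severity["post"] - severity["pre"]).dropna().to_numpy()
rng = np.random.default_rng(0)
boot = [rng.choice(change, size=change.size, replace=True).mean() for _ in range(10_000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"Mean severity change: {change.mean():.2f} (95% bootstrap CI {lo:.2f} to {hi:.2f})")
```

A positive residual-based severity index here would indicate a relatively lenient examiner and a negative one a relatively stringent examiner; the bootstrap interval is one simple way to express uncertainty in the pre-to-post shift without distributional assumptions.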
Examiner severity self-estimates correlated relatively poorly with measured severity on the two clinical case examination types pre-intervention (0.29 and 0.67) and were less accurate post-intervention. No significant effect of the intervention was identified once differences in case difficulty were controlled for, although there were fewer outlier examiners post-intervention. Drift in examiner severity over time was observed prior to the intervention. Participants rated the intervention as interesting and useful, and survey comments indicated that fairness, reassurance, and understanding their examiner colleagues are important to examiners.
Although our participants were receptive to the feedback and wanted to be "on the same page", we found no evidence that they used it to change their rating behaviours. Calibration of severity appears to be difficult for examiners, and further research into more effective ways of providing feedback is indicated.