Weitz Gunther, Vinzentius Christian, Twesten Christoph, Lehnert Hendrik, Bonnemeier Hendrik, König Inke R
Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Medizinische Klinik I, Lübeck, Germany.
Institut für Qualitätsentwicklung an Schulen Schleswig-Holstein, Kronshagen, Germany.
GMS Z Med Ausbild. 2014 Nov 17;31(4):Doc41. doi: 10.3205/zma000933. eCollection 2014.
The accuracy and reproducibility of medical skills assessment are generally low, and rater training has little or no effect. Our knowledge in this field, however, relies on studies involving video ratings of overall clinical performances. We hypothesised that rater training focussing on the frame of reference could improve grading accuracy in the curricular assessment of a highly standardised physical head-to-toe examination.
Twenty-one raters assessed the performance of 242 third-year medical students. Eleven raters had been randomly assigned to undergo a brief frame-of-reference training a few days before the assessment. A total of 218 encounters were successfully recorded on video and re-assessed independently by three additional observers. Accuracy was defined as the concordance between a rater's grade and the median of the observers' grades. After the assessment, both students and raters completed a questionnaire on their views of the assessment.
Rater training did not have a measurable influence on accuracy. However, trained raters graded significantly more stringently than untrained raters, and their overall stringency was closer to that of the observers. The questionnaire indicated a higher awareness of the halo effect among the trained raters. Although students' self-assessments mirrored the raters' assessments in both groups, students assessed by trained raters were more discontented with their grades.
While training had some marginal effects, it failed to have an impact on individual accuracy. These results from real-life encounters are consistent with previous studies on rater training using video assessments of clinical performances. The high degree of standardisation in this study was not sufficient to harmonise the trained raters' grading. The data support the notion that the process of appraising medical performance is highly individual. Frame-of-reference training, as applied here, does not effectively adjust physicians' judgement of medical students in real-life assessments.