Reubenson Alan, Schnepf Tanis, Waller Robert, Edmondston Stephen
School of Physiotherapy and Curtin Health Innovation Research Institute, Curtin University, Perth, Australia.
Clin Teach. 2012 Apr;9(2):119-22. doi: 10.1111/j.1743-498X.2011.00509.x.
The reliability of assessment is an important issue in the evaluation of competence in medical and allied health practice, particularly when assessments are conducted by multiple examiners. The purpose of this study was to examine the agreement between multiple examiners in the assessment of a postgraduate physiotherapy student using a specifically designed performance evaluation system.
Seven examiners simultaneously watched a recording of a postgraduate student's examination and treatment of one patient. The Postgraduate Physiotherapy Performance Assessment (PPPA) form was used to guide the assessment of performance in key areas of patient examination and management. Each examiner independently recorded a grade for each of five performance categories, and these scores were used to guide the global performance grade and mark.
Five examiners agreed on the global performance grade and four of the performance categories. The level of pass grade awarded was more variable, with scores in the performance categories spanning two grades, and in one case, three grades. The two examiners who were not in agreement with the majority consistently awarded higher grades across most performance categories.
This preliminary study has demonstrated majority agreement in global performance between multiple examiners when physiotherapy clinical practice is assessed against specific performance standards. Not all examiners awarded global grades consistent with the majority, and there was greater variability between examiners when grading performance in specific aspects of practice. These findings highlight the importance of examiner training and review sessions to improve inter-examiner agreement in assessments of clinical performance that require multiple examiners.