Huber Philippe, Baroffio Anne, Chamot Eric, Herrmann François, Nendaz Mathieu R, Vu Nu V
Unit of Development and Research in Medical Education, Faculty of Medicine, University of Geneva, Geneva, Switzerland.
Med Educ. 2005 Aug;39(8):852-8. doi: 10.1111/j.1365-2929.2005.02226.x.
Examinations that use standardised patients (SPs) commonly rely on checklist recordings to evaluate students' clinical performance. This paper examines whether, and to what extent, item and rater characteristics affect the reliability of history checklist recording in an SP-based assessment.
Checklist items were reviewed for the presence or absence of 5 item characteristics and for the use of a 2-point versus a 3-point scoring scale. Agreement between checklist recordings obtained from SPs and clinician-examiners (CEs) was compared by item characteristics, scoring scale and CEs' level of involvement in the assessment.
Based on 3179 pairs of recordings, the overall percentage of agreement between SPs and CEs was 83% (kappa = 0.64). Agreement was significantly higher for items scored on a 2-point than on a 3-point scale, and when the CE was also the author and the trainer of the station. After controlling for other factors, item characteristics were only marginally associated with level of interrater agreement.
This study suggests that attention should be paid to specific aspects of checklist development and checklist recording training when an SP or CE is used as recorder.
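The two agreement statistics reported above (percentage of agreement and kappa) can be illustrated with a minimal sketch. It assumes that the kappa reported is Cohen's kappa for two raters, which the abstract does not state explicitly. The function names and the ten paired recordings are hypothetical toy values chosen for illustration; they are not data from the study.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of paired recordings on which the two raters agree."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: for each score category, the product of the two
    # raters' marginal proportions, summed over all categories.
    categories = set(counts_a) | set(counts_b)
    p_chance = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical 2-point checklist recordings (1 = item done, 0 = not done).
sp_scores = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]  # standardised patient (toy data)
ce_scores = [1, 1, 0, 0, 0, 1, 1, 1, 0, 1]  # clinician-examiner (toy data)

print(f"agreement = {percent_agreement(sp_scores, ce_scores):.2f}")
print(f"kappa     = {cohen_kappa(sp_scores, ce_scores):.2f}")
```

Kappa is lower than raw agreement because it discounts the matches that two raters would produce by chance given how often each uses every score, which is why the abstract reports both figures.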