Moniz Tracy, Arntfield Shannon, Miller Kristina, Lingard Lorelei, Watling Chris, Regehr Glenn
Department of Communication Studies, Mount Saint Vincent University, Halifax, Nova Scotia, Canada.
Department of Obstetrics and Gynaecology, Western University, London, Ontario, Canada.
Med Educ. 2015 Sep;49(9):901-8. doi: 10.1111/medu.12771.
Reflective writing is a popular tool to support the growth of reflective capacity in undergraduate medical learners. Its popularity stems from research suggesting that reflective capacity may lead to improvements in skills such as empathy, communication, collaboration and professionalism. This has led to assumptions that reflective writing can also serve as a tool for student assessment. However, evidence to support the reliability and validity of reflective writing as a meaningful assessment strategy is lacking.
Using a published instrument for measuring 'reflective capacity' (the Reflection Evaluation for Learners' Enhanced Competencies Tool [REFLECT]), four trained raters independently scored four samples of writing from each of 107 undergraduate medical students to determine the reliability of reflective writing scores. REFLECT scores were then correlated with scores on a Year 4 objective structured clinical examination (OSCE) and Year 2 multiple-choice question (MCQ) examinations to examine, respectively, convergent and divergent validity.
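The four-rater reliability reported below is an internal-consistency statistic across raters. As a minimal sketch (with hypothetical scores, not the study's data), Cronbach's α for a subjects-by-raters score matrix can be computed as:

```python
import numpy as np

def cronbach_alpha(scores) -> float:
    """Cronbach's alpha for a (n_subjects, n_raters) score matrix,
    treating each rater as an 'item'."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                       # number of raters
    item_vars = scores.var(axis=0, ddof=1)    # per-rater variance
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of summed scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical example: two raters in perfect agreement (offset by 1 point)
# yield alpha = 1.0, since alpha reflects consistency, not absolute agreement.
print(cronbach_alpha([[1, 2], [2, 3], [3, 4]]))
```

Note that α rewards relative consistency between raters; systematic severity differences (one rater always scoring lower) do not lower it, which is one reason ICC is also reported for single-sample decisions.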
Across four writing samples, four-rater Cronbach's α-values ranged from 0.72 to 0.82, demonstrating reasonable inter-rater reliability with four raters using the REFLECT rubric. However, inter-sample reliability was fairly low (four-sample Cronbach's α = 0.54, single-sample intraclass correlation coefficient: 0.23), which suggests that performance on one reflective writing sample was not strongly indicative of performance on the next. Approximately 14 writing samples are required to achieve reasonable inter-sample reliability. The study found weak, non-significant correlations between reflective writing scores and both OSCE global scores (r = 0.13) and MCQ examination scores (r = 0.10), demonstrating a lack of relationship between reflective writing and these measures of performance.
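The reported figures are mutually consistent under the Spearman-Brown prophecy formula: starting from the single-sample ICC of 0.23, a four-sample composite projects to a reliability of about 0.54 (matching the four-sample α), and reaching a target of 0.80 requires about 14 samples. A sketch of that arithmetic (not the authors' exact computation):

```python
import math

def spearman_brown(single_reliability: float, n: int) -> float:
    """Projected reliability of the mean of n parallel measurements,
    given the reliability of a single measurement."""
    r = single_reliability
    return n * r / (1 + (n - 1) * r)

def samples_needed(single_reliability: float, target: float) -> int:
    """Invert the prophecy formula; round up to whole samples."""
    r = single_reliability
    return math.ceil(target * (1 - r) / (r * (1 - target)))

icc = 0.23  # single-sample ICC reported in the study
print(round(spearman_brown(icc, 4), 2))   # 0.54, the four-sample reliability
print(samples_needed(icc, 0.80))          # 14 samples to reach 0.80
```

This is why the Discussion's "14 writing samples per student" follows directly from the low inter-sample ICC rather than from any property of the raters.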
Our findings suggest that drawing meaningful conclusions about reflective capacity as a stable construct in individuals requires approximately 14 writing samples per student, each assessed by four or five raters. This calls into question the feasibility and utility of using reflective writing rigorously as an assessment tool in undergraduate medical education.