Natalie McGuire, Anita Acai, Ranil R. Sonnadara
Office of Professional Development and Educational Scholarship, Queen's University, Kingston, Ontario, Canada.
Department of Psychiatry and Behavioural Neurosciences and McMaster Education Research, Innovation and Theory (MERIT) Program, McMaster University, and St. Joseph's Education Research Centre (SERC), St. Joseph's Healthcare Hamilton, Hamilton, Canada.
Teach Learn Med. 2025 Jan-Mar;37(1):86-98. doi: 10.1080/10401334.2023.2276799. Epub 2023 Nov 15.
The McMaster Narrative Comment Rating Tool aims to capture critical features reflecting the quality of written narrative comments provided in the medical education context: valence/tone of language, degree of correction versus reinforcement, specificity, actionability, and overall usefulness.
Although narrative comments play an important role in competency-based medical education, not all of them contribute meaningfully to the development of learners' competence. To develop solutions that mitigate this problem, robust measures of narrative comment quality are needed. While some tools exist, most were created in specialty-specific contexts, have focused on only one or two features of feedback, or have focused on faculty perceptions of feedback, excluding learners from the validation process. In this study, we aimed to develop a detailed, broadly applicable narrative comment quality assessment tool that drew upon features of high-quality assessment and feedback and could be used by a variety of raters to inform future research, including applications related to automated analysis of narrative comment quality.
In Phase 1, we used the literature to identify five critical features of feedback. We then developed rating scales for each feature and collected 670 competency-based assessments completed by first-year surgical residents during their first six weeks of training. Residents came from nine different programs at a single Canadian institution. In Phase 2, we randomly selected 50 assessments with written feedback from the dataset. Two education researchers used the scale to independently score the written comments and refine the rating tool. In Phase 3, 10 raters (two medical education researchers, two medical students, two residents, two clinical faculty members, and two laypersons from the community) used the tool to independently and blindly rate the written comments from another 50 randomly selected assessments in the dataset. We compared scores between and across rater pairs to assess reliability.
Single-measures and average-measures intraclass correlation coefficient (ICC) scores ranged from moderate to excellent (ICCs = .51-.83 and .91-.98, respectively) across all categories and rater pairs. All tool domains were significantly correlated (p < .05), apart from valence, which was significantly correlated only with degree of correction versus reinforcement.
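For readers unfamiliar with the reliability statistics above, the sketch below shows how single- and average-measures ICCs of the kind reported here (two-way random effects, absolute agreement: ICC(2,1) and ICC(2,k)) can be computed from a comments-by-raters score matrix. The ratings in the example are invented for illustration only; they are not the study's data.

```python
# Hypothetical sketch of the ICC computation; the data are invented,
# NOT taken from the McMaster Narrative Comment Rating Tool study.

def icc_two_way_random(scores):
    """Return (single-measures, average-measures) ICC for a two-way
    random-effects, absolute-agreement model.

    `scores` is a list of rows, one per rated comment, with one
    column per rater.
    """
    n = len(scores)       # number of rated comments (subjects)
    k = len(scores[0])    # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between comments
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)            # mean square, rows (subjects)
    msc = ss_cols / (k - 1)            # mean square, columns (raters)
    mse = ss_err / ((n - 1) * (k - 1)) # mean square, error

    # Shrout-and-Fleiss-style formulas for ICC(2,1) and ICC(2,k).
    single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    average = (msr - mse) / (msr + (msc - mse) / n)
    return single, average

# Invented example: 6 comments scored by 2 raters on a 5-point scale.
ratings = [[4, 4], [2, 3], [5, 5], [3, 3], [1, 2], [4, 5]]
icc_single, icc_average = icc_two_way_random(ratings)
print(round(icc_single, 2), round(icc_average, 2))  # → 0.87 0.93
```

As in the study's results, the average-measures ICC is necessarily at least as high as the single-measures ICC, since averaging over raters reduces measurement error.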
Our findings suggest that the McMaster Narrative Comment Rating Tool can reliably be used by multiple raters, across a variety of rater types, and in different surgical contexts. As such, it has the potential to support faculty development initiatives on assessment and feedback, and may be used as a tool to conduct research on different assessment strategies, including automated analysis of narrative comments.