Reddy Shalini T, Tekian Ara, Durning Steven J, Gupta Shanu, Endo Justin, Affinati Brenda, Park Yoon Soo
J Grad Med Educ. 2018 Jun;10(3):269-275. doi: 10.4300/JGME-D-17-00435.1.
Minimally anchored Standard Rating Scales (SRSs), which are widely used in medical education, are hampered by suboptimal interrater reliability. Expert-derived frameworks, such as the Accreditation Council for Graduate Medical Education (ACGME) Milestones, may be helpful in defining level-specific anchors to use on rating scales.
We examined validity evidence for a Milestones-Based Rating Scale (MBRS) for scoring chart-stimulated recall (CSR).
Two 11-item scoring forms were developed, one with an MBRS and one with an SRS. Items and anchors for the MBRS were adapted from the ACGME Internal Medicine Milestones. Six standardized CSR videos were developed. Clinical faculty scored the videos using either the MBRS or the SRS in a randomized crossover design. Reliability of the MBRS versus the SRS was compared using intraclass correlation.
Twenty-two faculty were recruited for instrument testing. Some participants did not complete scoring, leaving 15 faculty who did (7 in the MBRS group and 8 in the SRS group). A total of 529 ratings (number of items × number of scores) using SRSs and 540 using MBRSs were available. Percent agreement was higher for MBRSs for only 2 of 11 items, use of consultants (92% versus 75%, P = .019) and unique characteristics of patients (96% versus 79%, P = .011), and for the overall score (89% versus 82%, P < .001). Interrater agreement was 0.61 for MBRSs and 0.51 for SRSs.
Adding milestones to our rating form resulted in a significant, but not substantial, improvement in the intraclass correlation coefficient. Improvement was inconsistent across items.