Faculty of Health Sciences, IDIBO Research Group, Rey Juan Carlos University, Madrid, Spain.
BMC Med Educ. 2023 Mar 30;23(1):197. doi: 10.1186/s12909-023-04187-3.
Student assessment should be carried out in an effective and objective manner that reduces the possibility of different evaluators awarding different scores, since such discrepancies influence the grade obtained and the consistency of education. The aim of the present study was to determine the agreement among four evaluators and to compare the overall scores they awarded when assessing portfolios of preclinical endodontic treatments performed by dental students, using an analytic rubric and a numeric rating scale.
A random sample of 42 portfolios prepared by fourth-year dental students during preclinical endodontic practice sessions was blindly assessed by four evaluators using two different evaluation methods: a specifically designed analytic rubric and a numeric rating scale. Six categories were analyzed: radiographic assessment, access preparation, shaping procedure, obturation, content of the portfolio, and presentation of the portfolio. The maximum global score was 10 points. For each evaluator, the overall scores obtained with the two methods were compared using Student's t-test, while agreement among evaluators was measured with intraclass correlation coefficients (ICC). The influence of the difficulty of the endodontic treatment on the evaluators' scores was analyzed by one-way ANOVA. Statistical tests were performed at a pre-set alpha of 0.05 using Stata 16.
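As an illustration of how inter-evaluator agreement can be quantified, the sketch below computes one common form of the ICC from a portfolios-by-evaluators score table. The abstract does not state which ICC model was used, so the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)) is an assumption here, as is the function name `icc_2_1`; the study itself used Stata 16, not this code.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` is a list of rows, one row per portfolio and one column per
    evaluator. The choice of ICC form is an assumption for illustration.
    """
    n = len(scores)        # number of portfolios (subjects)
    k = len(scores[0])     # number of evaluators (raters)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-portfolio mean square
    msc = ss_cols / (k - 1)              # between-evaluator mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical toy data: 6 portfolios scored by 4 evaluators
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

Because this form measures absolute agreement, a systematic offset between evaluators (one who scores consistently higher than the others) lowers the coefficient even when the evaluators rank the portfolios identically. The function divides by zero if every score in the table is identical, so real use would need a guard for that case.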
The difficulty of the root canal treatment did not influence the evaluators' scores, irrespective of the evaluation method used. When the analytic rubric was used, inter-evaluator agreement was substantial for radiographic assessment, access preparation, shaping procedure, obturation, and overall scores. With the numeric rating scale, inter-evaluator agreement ranged from moderate to fair. Mean overall scores were higher when the numeric rating scale was used. Presentation and content of the portfolio showed slight and fair agreement, respectively, among evaluators, regardless of the evaluation method applied.
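The labels used above (slight, fair, moderate, substantial) match the widely used Landis and Koch bands for agreement coefficients. The abstract does not give the cut-offs the authors applied, so the thresholds and the helper name `agreement_label` in this sketch are assumptions for illustration only.

```python
def agreement_label(value):
    """Map an agreement coefficient to a Landis and Koch style label.

    The cut-offs are the conventional bands, assumed here because the
    abstract does not report the thresholds actually used.
    """
    if value < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if value <= upper:
            return label
    raise ValueError("agreement coefficient cannot exceed 1")


# Hypothetical coefficients, not values reported by the study
for coefficient in (0.15, 0.35, 0.55, 0.70):
    print(coefficient, agreement_label(coefficient))
```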
Assessment guided by an analytic rubric allowed evaluators to reach higher levels of agreement than those obtained with a numeric rating scale. However, overall scores were lower when the rubric was used.