Byrd Jennifer S, Peeters Michael J
Union University College of Pharmacy.
University of Toledo College of Pharmacy & Pharmaceutical Sciences.
Innov Pharm. 2021 Feb 26;12(1). doi: 10.24926/iip.v12i1.2136. eCollection 2021.
There is a paucity of validation evidence for the assessment of clinical case-presentations delivered by Doctor of Pharmacy (PharmD) students. Within Kane's Framework for Validation, evidence for the scoring and generalization inferences should be generated first. Thus, our objectives were to characterize and improve scoring, and to build initial generalization evidence, in order to provide validation evidence for performance-based assessment of clinical case-presentations.
Third-year PharmD students worked up patient-cases from a local hospital. Students orally presented and defended their therapeutic care-plan to pharmacist preceptors (evaluators) and fellow students. Evaluators scored each presentation using an 11-item instrument with a 6-point rating-scale. In addition, evaluators scored a global-item with a 4-point rating-scale. Rasch Measurement was used for scoring analysis, while Generalizability Theory was used for generalization analysis.
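As an illustration of what a Rasch analysis of a rating-scale involves, the following minimal sketch computes category probabilities under the Andrich rating scale model, one common Rasch-family model for polytomous ratings. The source does not state which Rasch model was fitted, so the choice of model, the function name `category_probabilities`, and all numeric values here are assumptions for illustration only.

```python
import math

def category_probabilities(theta, delta, thresholds):
    """Andrich rating scale model (illustrative, not the authors' analysis).

    theta      : person (student) measure in logits
    delta      : item difficulty in logits
    thresholds : step thresholds tau_1..tau_m, giving categories 0..m
    Returns the probability of each category 0..m.
    """
    # Cumulative logit for category k is the sum over j <= k of (theta - delta - tau_j);
    # category 0 has a cumulative logit of 0 by convention.
    logits = [0.0]
    running = 0.0
    for tau in thresholds:
        running += theta - delta - tau
        logits.append(running)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical thresholds: a 6-point scale has 5 thresholds, a 4-point scale has 3.
six_point = category_probabilities(0.5, 0.0, [-2.0, -1.0, 0.0, 1.0, 2.0])
four_point = category_probabilities(0.5, 0.0, [-1.5, 0.0, 1.5])
print([round(p, 3) for p in six_point])
print([round(p, 3) for p in four_point])
```

In practice, thresholds are estimated from the rating data; categories whose estimated thresholds are disordered or that are rarely used are typical candidates for collapsing.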
Thirty students each presented five cases, which were evaluated by 15 preceptors using an 11-item instrument. Using Rasch Measurement, the 11-item instrument's 6-point rating-scale did not function as intended; it functioned only once collapsed to a 4-point rating-scale. This revised 11-item instrument also showed redundancy. Alternatively, the global-item performed reasonably on its own. Using multivariate Generalizability Theory, the g-coefficient (reliability) for the series of five case-presentations was 0.76 with the 11-item instrument, and 0.78 with the global-item. Reliability depended largely on the number of case-presentations and, to a lesser extent, on the number of evaluators per case-presentation.
Our pilot results confirm that scoring should be simple (both scale and instrument). More specifically, the longer 11-item instrument provided measurement but showed redundancy, whereas the single global-item provided measurement over multiple case-presentations. Further, acceptable reliability can be reached by trading off the number of case-presentations against the number of evaluators per case-presentation.
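The trade-off between case-presentations and evaluators can be sketched as a decision study in Generalizability Theory. The example below assumes a simplified univariate, fully crossed student x case x evaluator design, which is not necessarily the multivariate design used in this study. The variance components are hypothetical placeholders, not the study's estimates, and the function name `g_coefficient` is introduced here for illustration.

```python
def g_coefficient(var_p, var_pc, var_pr, var_pcr_e, n_cases, n_raters):
    """Relative g-coefficient for a crossed person x case x rater design.

    var_p     : variance between students (the object of measurement)
    var_pc    : student-by-case interaction variance
    var_pr    : student-by-evaluator interaction variance
    var_pcr_e : residual variance (three-way interaction plus error)
    """
    # Relative error variance shrinks as scores are averaged over
    # more cases and more evaluators.
    relative_error = (
        var_pc / n_cases
        + var_pr / n_raters
        + var_pcr_e / (n_cases * n_raters)
    )
    return var_p / (var_p + relative_error)

# Hypothetical variance components (assumed for illustration only).
COMPONENTS = dict(var_p=0.30, var_pc=0.25, var_pr=0.02, var_pcr_e=0.30)

for n_cases in (1, 3, 5, 7):
    row = [
        round(g_coefficient(**COMPONENTS, n_cases=n_cases, n_raters=n_raters), 2)
        for n_raters in (1, 2, 3)
    ]
    print(f"cases={n_cases}: evaluators 1..3 -> {row}")
```

With components like these, where student-by-case variance is large relative to student-by-evaluator variance, adding case-presentations raises the coefficient more than adding evaluators does, which matches the pattern reported above.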