Medical Education Research Center, Kanazawa University, Kanazawa, Japan.
Department of Molecular Genetics, Kanazawa University, Kanazawa, Japan.
Adv Health Sci Educ Theory Pract. 2024 Jul;29(3):949-965. doi: 10.1007/s10459-023-10290-3. Epub 2023 Oct 18.
The objective structured clinical examination (OSCE) is widely used to assess medical students' clinical skills. Virtual OSCEs replaced in-person OSCEs during the COVID-19 pandemic; however, their reliability has yet to be robustly analyzed. By applying generalizability (G) theory, this study aimed to evaluate the reliability of a hybrid OSCE, which combined in-person and online methods, and to gain insights into improving the reliability of OSCEs. During the 2020-2021 hybrid OSCEs, one examinee, one rater, and a vinyl mannequin for physical examination participated onsite, while a standardized simulated patient (SP) for medical interviewing and a second rater joined online in a single virtual breakout room on an audiovisual conferencing system. G-coefficients and the 95% confidence interval of the borderline score, termed the border zone (BZ), were calculated under the standard setting of 6 stations, 2 raters, and 6 items. The G-coefficients of the in-person (2017-2019) and hybrid (2020-2021) OSCEs under the standard setting were estimated to be 0.624, 0.770, 0.782, 0.759, and 0.823, respectively. The BZs were estimated to be 2.43-3.57, 2.55-3.45, 2.59-3.41, 2.59-3.41, and 2.51-3.49, respectively, on a score range of 1 to 6. Although the hybrid OSCEs showed reliability comparable to that of the in-person OSCEs, they need further improvement to serve as a very high-stakes examination. In addition to increasing the number of clinical vignettes, having more proficient online/on-demand raters and/or online SPs for medical interviews could improve the reliability of OSCEs. Reliability can also be ensured through supplementary examination and by increasing the number of online raters for the small number of students whose scores fall within the BZ.
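As a rough illustration of the quantities reported above, the following Python sketch does two things. First, it shows how a relative G-coefficient responds to the number of stations and raters in a decision study. This assumes a simplified, fully crossed person x station x rater design with the item facet omitted, which is not necessarily the design used in the study, and the variance components are hypothetical placeholders, not values from the paper. Second, it summarizes the reported BZs by their midpoint and half-width, assuming the five intervals are listed in year order (2017 to 2021). The implied standard error is computed under the further assumption that each BZ is the borderline score plus or minus 1.96 standard errors, which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass
class VarianceComponents:
    """Variance components for a simplified p x s x r design (hypothetical)."""
    person: float          # differences among examinees (the object of measurement)
    person_station: float  # examinee-by-station interaction
    person_rater: float    # examinee-by-rater interaction
    residual: float        # three-way interaction confounded with random error


def g_coefficient(vc: VarianceComponents, n_stations: int, n_raters: int) -> float:
    """Relative G-coefficient for a D-study with the given facet sample sizes."""
    relative_error = (
        vc.person_station / n_stations
        + vc.person_rater / n_raters
        + vc.residual / (n_stations * n_raters)
    )
    return vc.person / (vc.person + relative_error)


# BZs as reported in the abstract; the year labels assume the values
# are listed in chronological order.
REPORTED_BZ = {
    2017: (2.43, 3.57),
    2018: (2.55, 3.45),
    2019: (2.59, 3.41),
    2020: (2.59, 3.41),
    2021: (2.51, 3.49),
}


def bz_summary(lower: float, upper: float) -> tuple[float, float, float]:
    """Return (midpoint, half-width, implied SE) of a border zone.

    The implied SE assumes BZ = borderline score +/- 1.96 * SE.
    """
    midpoint = (lower + upper) / 2
    half_width = (upper - lower) / 2
    return midpoint, half_width, half_width / 1.96


if __name__ == "__main__":
    # Hypothetical variance components, chosen only to show the trend.
    vc = VarianceComponents(person=0.30, person_station=0.60,
                            person_rater=0.05, residual=0.40)
    for n_stations in (6, 9, 12):
        g = g_coefficient(vc, n_stations=n_stations, n_raters=2)
        print(f"{n_stations} stations, 2 raters: G = {g:.3f}")

    for year, (lower, upper) in REPORTED_BZ.items():
        mid, half, se = bz_summary(lower, upper)
        print(f"{year}: midpoint {mid:.2f}, half-width {half:.2f}, implied SE {se:.2f}")
```

Under this simplified model, adding stations or raters shrinks the error term and raises G, which is consistent with the abstract's suggestion that more clinical vignettes and more online raters could improve reliability. All five reported BZs are centered on a score of 3.00.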