Lee-Jayaram Jannet, Ishikawa Kyle, Lee Yu Jin, Tanaka Len Y, Lee Benjamin, Liang Bao Xin, Berg Benjamin W
SimTiki Simulation Center, John A. Burns School of Medicine, University of Hawai'i at Mānoa, Honolulu, USA.
Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawai'i at Mānoa, Honolulu, USA.
Cureus. 2024 Dec 11;16(12):e75570. doi: 10.7759/cureus.75570. eCollection 2024 Dec.
Introduction
Debriefing in healthcare simulation reinforces learning objectives, closes performance gaps, and improves future practice and patient care. The Debriefing Assessment for Simulation in Healthcare (DASH) is a validated tool; however, localized rater training for the DASH has not been described. We sought to augment DASH anchors with localized notations, localize DASH rater training, assess the correlation of the localized DASH with other debriefing practices/factors, and assess the reliability of the localized tool.

Methods
This study was conducted at the SimTiki Simulation Center, John A. Burns School of Medicine, University of Hawai'i at Mānoa. Three simulation experts without prior DASH training developed a list of debriefing best practices/factors, reviewed the DASH handbook, and transcribed the DASH Rater Long Form version, with example behaviors, into a rating document. Research assistants recorded best practices/factors data from archived debriefing videos. The simulation experts independently scored debriefings, resolved discrepancies, and added localized criteria to the DASH. Rater calibration was completed with an intraclass correlation coefficient (ICC) of 0.884. The raters then independently scored 43 debriefings recorded during July-December 2022. DASH scores were compared to observed debriefing best practices/factors.

Results
The overall ICC for DASH behaviors was 0.810 for agreement and 0.825 for consistency. Behavior scores ranged from 2.45 (SD 0.70) to 4.42 (SD 0.81). The three lowest-scoring DASH behaviors were 2A (2.45), 4A (3.41), and 3B (3.44). For behavior 2B, which concerns realism, there was significant inconsistency in the use of the not applicable (NA) designation. Construct correlation between the DASH and the best practices/factors supported convergent validity.

Conclusion
High interrater reliability followed localized rater training and the addition of localized notations to the DASH. Correlation with debriefing best practices/factors strengthens the DASH's validity evidence. The generally lower DASH behavior scores suggest that the interpretation of DASH scores is best contextualized within a shared, localized DASH construct. Comparisons of DASH numerical scores across institutions, raters, and cultures may not reflect absolute debriefing quality.
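The abstract reports two kinds of ICC: an "agreement" ICC, which penalizes systematic scoring differences between raters, and a "consistency" ICC, which measures only whether raters rank debriefings the same way. The paper does not describe its computational workflow; the following is a minimal sketch of how this distinction can be computed in Python with the pingouin library, using toy long-format data (the column names and scores are illustrative assumptions, not data from the study).

```python
# Minimal sketch (not from the paper): computing agreement vs. consistency
# ICC for two raters scoring the same set of debriefings with pingouin.
import pandas as pd
import pingouin as pg

# Long-format ratings: one row per rater per debriefing (toy values).
df = pd.DataFrame({
    "debriefing": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "rater":      ["A", "B"] * 5,
    "score":      [4.2, 4.0, 2.5, 2.8, 3.4, 3.6, 4.4, 4.3, 3.0, 3.2],
})

icc = pg.intraclass_corr(data=df, targets="debriefing",
                         raters="rater", ratings="score")

# ICC2 (two-way random effects, absolute agreement) is lowered by systematic
# rater offsets; ICC3 (two-way mixed effects, consistency) ignores them,
# which is why the two reported values (0.810 vs. 0.825) can differ.
print(icc[icc["Type"].isin(["ICC2", "ICC3"])][["Type", "ICC", "CI95%"]])
```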