
i-Assess: Evaluating the impact of electronic data capture for OSCE.

Author Affiliations

Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Canada.

Touchstone Institute, Toronto, Canada.

Publication Information

Perspect Med Educ. 2018 Apr;7(2):110-119. doi: 10.1007/s40037-018-0410-4.

Abstract

INTRODUCTION

Tablet-based assessments offer benefits over scannable-paper assessments; however, little is known about their impact on the variability of assessment scores.

METHODS

Two studies were conducted to evaluate changes in rating technology. Rating modality (paper vs. tablet) was manipulated between candidates (Study 1) and within candidates (Study 2). Average scores were analyzed using repeated-measures ANOVA, Cronbach's alpha, and generalizability theory. Post-hoc analyses included a Rasch analysis and McDonald's omega.
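As a point of reference for the internal-consistency analysis above, Cronbach's alpha can be computed directly from an examinee-by-station score matrix. The sketch below is illustrative only: the data are hypothetical, not taken from the study.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (examinees x stations) score matrix.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
    """
    k = scores.shape[1]                          # number of stations/items
    item_vars = scores.var(axis=0, ddof=1)       # per-station score variance
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of examinee totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 6 examinees rated on 4 OSCE stations (5-point scale)
scores = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [3, 3, 3, 3],
    [5, 4, 5, 5],
    [2, 3, 2, 2],
], dtype=float)

print(cronbach_alpha(scores))  # well above the 0.8 threshold reported here
```

Because alpha treats stations as interchangeable items, a high value (as in both studies, >0.8) indicates consistent scoring across stations regardless of rating modality.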

RESULTS

Study 1 revealed a main effect of modality (F(1,152) = 25.06, p < 0.01): average tablet-based scores (3.39/5, 95% CI 3.28 to 3.51) were higher than average scan-sheet scores (3.00/5, 95% CI 2.90 to 3.11). Study 2 also revealed a main effect of modality (F(1,88) = 15.64, p < 0.01); however, the difference was reduced to 2%, with scan-sheet scores (3.36, 95% CI 3.30 to 3.42) higher than tablet scores (3.27, 95% CI 3.21 to 3.33). Internal consistency (alpha and omega) remained high (>0.8) and inter-station reliability remained constant (0.3). Rasch analyses showed no relationship between station difficulty and rating modality.

DISCUSSION

Analyses of average scores may be misleading without an understanding of internal consistency and overall reliability of scores. Although updating to tablet-based forms did not result in systematic variations in scores, routine analyses ensured accurate interpretation of the variability of assessment scores.

CONCLUSION

This study demonstrates the importance of ongoing program evaluation and data analysis.


Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1787/5889381/c6cb143ff86e/40037_2018_410_Fig1_HTML.jpg
