Detecting rater bias using a person-fit statistic: a Monte Carlo simulation study.

Affiliations

Université de Sherbrooke, Sherbrooke, Québec, Canada.

Université Laval, Québec, Québec, Canada.

Publication information

Perspect Med Educ. 2018 Apr;7(2):83-92. doi: 10.1007/s40037-017-0391-8.

Abstract

INTRODUCTION

With the Standards voicing concern for the appropriateness of response processes, we need to explore strategies that would allow us to identify inappropriate rater response processes. Although certain statistics can help detect rater bias, their use is complicated either by a lack of data about their actual power to detect rater bias or by the difficulty of applying them in the context of health professions education. This exploratory study aimed to establish whether the lz statistic is worth pursuing as a means of detecting rater bias.

METHODS

We conducted a Monte Carlo simulation study to investigate the power of a specific detection statistic: the standardized likelihood lz person-fit statistic (PFS). Our primary outcome was the detection rate of biased raters using the lz statistic, where biased raters were those we manipulated into being either stringent (giving lower scores) or lenient (giving higher scores). We controlled for the number of biased raters in a sample (6 levels) and the rate of bias per rater (6 levels).
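As a rough illustration of the kind of simulation described above, the sketch below computes the standardized log-likelihood person-fit statistic (commonly written l_z) for dichotomous checklist items and estimates how often a stringent manipulation is flagged. It is not the authors' code. The Rasch model, the 40-item checklist, the uniform item difficulties, the known (not estimated) parameters, the one-sided cutoff of -1.645, and all function names are assumptions made only for this example.

```python
import math
import random


def rasch_prob(theta, b):
    """Probability of a positive (1) checklist score under an assumed Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def lz_statistic(responses, probs):
    """Standardized log-likelihood person-fit statistic for 0/1 responses."""
    log_lik = expected = variance = 0.0
    for u, p in zip(responses, probs):
        log_lik += u * math.log(p) + (1 - u) * math.log(1 - p)
        expected += p * math.log(p) + (1 - p) * math.log(1 - p)
        variance += p * (1 - p) * math.log(p / (1 - p)) ** 2
    return (log_lik - expected) / math.sqrt(variance)


def make_stringent(responses, bias_rate, rng):
    """Force a randomly chosen share of the items to 0 (a lower score)."""
    n_biased = round(bias_rate * len(responses))
    biased = set(rng.sample(range(len(responses)), n_biased))
    return [0 if i in biased else u for i, u in enumerate(responses)]


def detection_rate(bias_rate, n_raters=2000, n_items=40, cutoff=-1.645, seed=1):
    """Share of simulated stringent raters whose l_z falls below the cutoff."""
    rng = random.Random(seed)
    # Hypothetical item difficulties, treated as known.
    difficulties = [rng.uniform(-2.0, 2.0) for _ in range(n_items)]
    flagged = 0
    for _ in range(n_raters):
        theta = rng.gauss(0.0, 1.0)  # ability of the candidate being rated
        probs = [rasch_prob(theta, b) for b in difficulties]
        responses = [1 if rng.random() < p else 0 for p in probs]
        biased = make_stringent(responses, bias_rate, rng)
        if lz_statistic(biased, probs) < cutoff:
            flagged += 1
    return flagged / n_raters
```

Calling `detection_rate(0.6)` and `detection_rate(0.1)` gives detection rates under these assumed conditions only; they are not expected to reproduce the figures reported in the study. A lenient manipulation could be sketched the same way by forcing the selected items to 1 instead of 0.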

RESULTS

Overall, stringent raters (M = 0.84, SD = 0.23) were easier to detect than lenient raters (M = 0.31, SD = 0.28). Raters with a higher rate of bias were easier to detect than raters with a lower rate of bias (60% bias: 62, SD = 0.37; 10% bias: 43, SD = 0.36).

DISCUSSION

The lz PFS appears to have potential for identifying biased raters. We observed detection rates as high as 90% for stringent raters for whom we manipulated more than half of their checklist. Although these results are promising, we cannot generalize them to the use of PFS with estimated item/station parameters or with real data. Studies under those conditions should be conducted to assess the feasibility of using PFS to identify rater bias.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9820/5889374/54082edd488e/40037_2017_391_Fig1_HTML.jpg
