Department of Anesthesiology, Taipei Veterans General Hospital, Taipei, Taiwan, ROC.
J Chin Med Assoc. 2013 Jun;76(6):344-9. doi: 10.1016/j.jcma.2013.02.008. Epub 2013 Apr 18.
Student examinations are an essential component of medical education, and item analysis is important for assessing test quality. Among the psychometric theories used for test analysis, item response theory is more flexible and versatile than the alternatives. This study aimed to apply item response models to the analysis of an anesthesiology examination for medical and dental students.
The examination comprised 50 items and was administered to 170 fifth- and sixth-year medical and dental students. One- and two-parameter logistic (1-PL and 2-PL) item response models were used to conduct item analyses of the examination. Fit statistics were examined to exclude misfitting items and to evaluate test reliability. Goodness-of-fit analyses were used to select the model that fit the data better. Examinee ability and item difficulty were estimated and then expressed on a common scale. Items with potential differential functioning were detected using logistic regression.
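For readers unfamiliar with the two models, the following is a minimal sketch of their item response functions, not the authors' implementation. It assumes the standard logistic parameterization without a scaling constant, which the abstract does not specify; the function names `irf_2pl` and `irf_1pl` and the symbols `theta` (examinee ability), `b` (item difficulty), and `a` (item discrimination) are illustrative.

```python
import math


def irf_2pl(theta, b, a=1.0):
    """Probability of a correct response under the 2-PL model.

    theta: examinee ability, b: item difficulty, a: item discrimination.
    All three are on the same logit scale (an assumed parameterization).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def irf_1pl(theta, b):
    """1-PL model: the 2-PL model with discrimination fixed at 1 for every item."""
    return irf_2pl(theta, b, a=1.0)


# When ability equals difficulty, the probability of success is exactly one half.
p_matched = irf_1pl(0.0, 0.0)

# A more discriminating item separates examinees around its difficulty more sharply.
p_flat = irf_2pl(1.0, 0.0, a=0.5)
p_steep = irf_2pl(1.0, 0.0, a=2.0)
```

The only difference between the two models in this sketch is whether `a` is allowed to vary by item, which is what a goodness-of-fit comparison between them evaluates.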
The goodness-of-fit analysis showed that, for these data, the 1-PL model was the more suitable for item response analysis. No misfitting item was found, and the test reliability was 0.81 (1-PL model). The mean examinee ability was set at 0 by definition [standard deviation (SD) = 0.61], and the mean item difficulty was -2.08 (SD = 1.93). Twenty-four items had a difficulty below the ability of the least able examinee, and three items had a difficulty above the ability of the most able examinee. Four items with potential differential functioning were identified.
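As a rough illustration of what these figures imply, the calculation below evaluates the 1-PL response function for an examinee of mean ability on an item of mean difficulty. It assumes the logistic form with unit discrimination and no scaling constant, which the abstract does not state; $\theta$ and $b$ denote ability and difficulty, and the result is a worked example rather than a reported finding.

```latex
P(\text{correct} \mid \theta, b) = \frac{1}{1 + e^{-(\theta - b)}},
\qquad
P(\text{correct} \mid \theta = 0,\ b = -2.08) = \frac{1}{1 + e^{-2.08}} \approx \frac{1}{1 + 0.125} \approx 0.89
```

Under that assumption, an average examinee would answer an item of average difficulty correctly about 89% of the time, consistent with many items lying below the ability range of the examinees.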
Item response models are useful for analyzing medical examinations. In addition to test reliability, item difficulty, and examinee ability, they provide valuable information on model comparison and on the identification of items with differential functioning.