University of Nottingham, UK.
Med Teach. 2011;33(6):447-58. doi: 10.3109/0142159X.2011.564682.
A key goal of assessment in medical education is to minimise the errors that influence a test, so that the observed score approaches a learner's 'true' score as reliably and validly as possible. To achieve this, assessors need to be aware of the potential biases that can affect every component of the assessment cycle, from question creation to the interpretation of exam scores. This Guide describes and explains how the results of objective examinations can be analysed to improve the validity and reliability of assessments in medical education. We cover the interpretation of measures of central tendency, measures of variability and standard scores. We describe how to calculate the item-difficulty index and the item-discrimination index for examination items using different statistical procedures, followed by an overview of reliability estimates. The post-examination analytical methods described in this Guide enable medical educators to construct reliable and valid achievement tests. They also enable medical educators to build question banks by collecting appropriate questions from existing examinations, which can then support computerised adaptive testing.
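The abstract names several post-examination statistics without giving formulas. The sketch below shows one common way to compute them for dichotomously scored items (1 = correct, 0 = incorrect). It is not taken from the Guide: the function names, the 27% upper/lower grouping, the use of KR-20 as the reliability estimate, the use of population (rather than sample) variance, and the response matrix are all assumptions made for illustration.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical data: one row per examinee, one column per item (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

def item_difficulty(rows, item):
    """Proportion of examinees who answered the item correctly (higher = easier)."""
    return mean(row[item] for row in rows)

def item_discrimination(rows, item, fraction=0.27):
    """Upper-lower index: proportion correct in the top-scoring group
    minus proportion correct in the bottom-scoring group.
    The 27% group size is a conventional choice, assumed here."""
    ranked = sorted(rows, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    return mean(r[item] for r in upper) - mean(r[item] for r in lower)

def kr20(rows):
    """Kuder-Richardson 20 reliability estimate for 0/1 items,
    using population variance of the total scores (an assumption)."""
    n_items = len(rows[0])
    totals = [sum(r) for r in rows]
    difficulties = [item_difficulty(rows, i) for i in range(n_items)]
    sum_pq = sum(p * (1 - p) for p in difficulties)
    return (n_items / (n_items - 1)) * (1 - sum_pq / pvariance(totals))

def z_scores(rows):
    """Standard scores: each total score expressed in SD units from the mean."""
    totals = [sum(r) for r in rows]
    mu, sd = mean(totals), pstdev(totals)
    return [(t - mu) / sd for t in totals]

if __name__ == "__main__":
    for i in range(len(responses[0])):
        print(i, item_difficulty(responses, i), item_discrimination(responses, i))
    print("KR-20:", kr20(responses))
    print("z-scores:", z_scores(responses))
```

Under these assumptions, an item with difficulty near 0 or 1 and discrimination near 0 would be a candidate for review before it enters a question bank. Other procedures, such as point-biserial correlation for discrimination or Cronbach's alpha for reliability, may give different values.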