National Board of Medical Examiners, 3750 Market Street, Philadelphia, PA, 19104, USA.
Adv Health Sci Educ Theory Pract. 2019 Mar;24(1):141-150. doi: 10.1007/s10459-018-9855-9. Epub 2018 Oct 25.
Research suggests that the three-option format is optimal for multiple choice questions (MCQs). This conclusion is supported by numerous studies showing that most distractors (i.e., incorrect answers) are selected by so few examinees that they are essentially nonfunctional. However, nearly all studies have defined a distractor as nonfunctional if it is selected by fewer than 5% of examinees. A limitation of this definition is that the proportion of examinees available to choose a distractor depends on overall item difficulty. This is especially problematic for mastery tests, which consist of items that most examinees are expected to answer correctly. Under the traditional definition, a five-option MCQ answered correctly by more than 90% of examinees can have at most one functional distractor, because fewer than 10% of examinees remain to be divided among the four distractors. The primary purpose of the present study was to evaluate an index of distractor nonfunctionality that is sensitive to item difficulty. A secondary purpose was to extend previous research by studying distractor functionality within the context of professionally developed credentialing tests. Data were analyzed for 840 MCQs, each with five options. Results based on the traditional definition of nonfunctional were consistent with previous research indicating that most MCQs had one or two functional distractors. In contrast, the newly proposed index indicated that nearly half (47.3%) of all items had three or four functional distractors. Implications for item and test development are discussed.
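The arithmetic behind the limitation of the traditional 5% rule can be sketched in a few lines of Python. This is only an illustration of the traditional definition and the ceiling it implies. It is not the difficulty-sensitive index evaluated in the study, which the abstract does not specify. The function names and the example selection proportions are hypothetical.

```python
import math

# Traditional cutoff: a distractor chosen by fewer than 5% of all
# examinees is treated as nonfunctional.
THRESHOLD = 0.05


def count_functional(distractor_props, threshold=THRESHOLD):
    """Count distractors selected by at least `threshold` of all examinees."""
    return sum(1 for p in distractor_props if p >= threshold)


def max_functional(p_correct, n_distractors=4, threshold=THRESHOLD):
    """Upper bound on functional distractors for a given item difficulty.

    Only (1 - p_correct) of examinees are left to choose any distractor,
    so at most floor((1 - p_correct) / threshold) distractors can reach
    the cutoff. The small epsilon guards against floating-point error
    when the ratio is mathematically a whole number.
    """
    remaining = 1.0 - p_correct
    return min(n_distractors, math.floor(remaining / threshold + 1e-9))


if __name__ == "__main__":
    # Hypothetical five-option item answered correctly by 92% of examinees.
    props = [0.06, 0.01, 0.005, 0.005]
    print(count_functional(props))   # functional distractors observed
    print(max_functional(0.92))      # ceiling imposed by item difficulty
    print(max_functional(0.50))      # a harder item is not constrained
```

For an easy item the ceiling, rather than the quality of the distractors, decides the count, which is the dependence on item difficulty that the abstract describes.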