Center for Research and Innovation in Medical Education, University of Groningen and University Medical Center Groningen, Groningen, The Netherlands.
Med Teach. 2010;32(2):154-60. doi: 10.3109/01421590903196979.
Teachers involved in test development usually prefer criterion-referenced standard-setting methods that use expert panels. Because expert panels are costly, standards are often set instead by a pre-fixed percentage of questions answered correctly or by norm-referenced methods aimed at ranking examinees.
To discuss the advantages and disadvantages of commonly used criterion-referenced and norm-referenced methods and to present a new compromise method: a standard based on a fixed cut-off score that uses the best-scoring students as the reference point.
Historical data from 54 Maastricht (norm-referenced) and 52 Groningen (criterion-referenced) tests were used to demonstrate large discrepancies and variability in cut-off scores and failure rates. Subsequently, the compromise model, known as Cohen's method, was applied to the Groningen tests.
The Maastricht norm-referenced method led to a large variation in required cut-off scores (15-46%), but a stable failure rate (about 17%). The Groningen method, with a conventional, pre-fixed standard of 60%, led to a large variation in failure rates (17-97%). The compromise method reduced the variation in both required cut-off scores and failure rates.
Both the criterion-referenced and the norm-referenced standards used in practice have disadvantages. The proposed compromise model reduces the disadvantages of both methods and is considered more acceptable. Finally, compared with standard-setting methods that use panels, this method is affordable.
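The three kinds of standard compared above can be illustrated with a minimal Python sketch. The abstract does not give the exact parameters, so everything numeric here is an assumption: the compromise cut-off is taken as a fixed fraction (60%) of the score reached by the 95th-percentile student, optionally corrected for a guessing score, and the norm-referenced cut-off is taken as the mean minus one standard deviation, which is a common rule and not necessarily the one used in Maastricht. All function and parameter names are hypothetical.

```python
from statistics import mean, pstdev


def percentile(scores, p):
    """Return the p-th percentile (0-100) using linear interpolation."""
    ordered = sorted(scores)
    if not ordered:
        raise ValueError("scores must not be empty")
    rank = (len(ordered) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def fixed_cutoff(max_score, fraction=0.60):
    """Pre-fixed standard: a fixed fraction of the maximum attainable score."""
    return fraction * max_score


def norm_referenced_cutoff(scores, sd_units=1.0):
    """Assumed norm-referenced rule: cohort mean minus sd_units standard deviations."""
    return mean(scores) - sd_units * pstdev(scores)


def compromise_cutoff(scores, fraction=0.60, reference_percentile=95,
                      guess_score=0.0):
    """Compromise standard anchored on the best-scoring students.

    Assumption: the reference point is the score at reference_percentile,
    and the cut-off is a fixed fraction of the distance between the
    guessing score and that reference score.
    """
    reference = percentile(scores, reference_percentile)
    return guess_score + fraction * (reference - guess_score)


def failure_rate(scores, cutoff):
    """Fraction of examinees scoring strictly below the cut-off."""
    return sum(score < cutoff for score in scores) / len(scores)
```

On a difficult test where even the best students score low, `compromise_cutoff` moves down with the reference score, whereas `fixed_cutoff` stays put and the failure rate rises; on an easy test the compromise cut-off stays tied to an absolute fraction rather than drifting with the cohort mean, as `norm_referenced_cutoff` does.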