Battles J B, Carpenter J L, McIntire D D, Wagner J M
Office of Medical Education, University of Texas Southwestern Medical Center, Dallas Southwestern Medical School 75235-9065.
Acad Med. 1994 May;69(5):370-6. doi: 10.1097/00001888-199405000-00010.
Structuring a clinical performance examination that uses standardized patients (SPs) for large groups of examinees often involves the use of two or more parallel forms of the examination with different SPs portraying the same case on the different forms. In addition, each form may be administered more than once on different days and/or in different locations.
To determine the effects of critical variables, such as day of examination, time of day (AM/PM), which of two simultaneous forms was taken, and sequencing effects, a univariate nested factorial analysis of variance was conducted for each of four annual SP examinations (1990-1993) at the University of Texas Southwestern Medical School. The examinations were given to approximately 200 second-year students per year at the end of their Introduction to Clinical Medicine course, and were graded on a pass/fail basis.
Statistically significant differences were found for the following variables: (1) time of day (AM or PM) and day of examination, although these effects were inconsistent and of small magnitude; (2) sequencing for the first two stations, in each form of the examination and in all four years; and (3) form-within-case differences (i.e., differences between SPs), between the two forms of the examination in each year of administration. To minimize the impact of these variables, two mean equating formulas were applied to the scores. Few examinees' pass/fail status would have been affected by either adjustment.
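The abstract does not state the two mean equating formulas that were used, so the following is only a minimal sketch of mean equating in general, not the authors' method. It assumes the simplest version: each form's scores are shifted by a constant so that the form means coincide, either with a chosen reference form or with the grand mean of all examinees. The function names, form labels, scores, and cut score are hypothetical.

```python
from statistics import mean


def mean_equate(scores_by_form, reference=None):
    """Shift each form's scores by a constant so all forms share one mean.

    scores_by_form: dict mapping a form label to a list of scores.
    reference: label of the form whose mean is the target; if None,
               the grand mean over all examinees is used instead.
    Both variants are generic illustrations, not the formulas from the study.
    """
    if reference is not None:
        target = mean(scores_by_form[reference])
    else:
        all_scores = [s for scores in scores_by_form.values() for s in scores]
        target = mean(all_scores)

    equated = {}
    for form, scores in scores_by_form.items():
        shift = target - mean(scores)  # constant adjustment for this form
        equated[form] = [s + shift for s in scores]
    return equated


def pass_fail_changes(raw, equated, cut_score):
    """Count examinees whose pass/fail status differs after equating."""
    changes = 0
    for form in raw:
        for before, after in zip(raw[form], equated[form]):
            if (before >= cut_score) != (after >= cut_score):
                changes += 1
    return changes


# Hypothetical scores for two parallel forms (not data from the study).
raw_scores = {"A": [70.0, 80.0, 90.0], "B": [60.0, 70.0, 80.0]}
to_reference = mean_equate(raw_scores, reference="A")
to_grand_mean = mean_equate(raw_scores)
changed = pass_fail_changes(raw_scores, to_reference, cut_score=65.0)
```

Counting status changes against the cut score, as in `pass_fail_changes`, is one way to check how many pass/fail decisions an adjustment of this kind would alter.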
The parallel-forms examination format is minimally affected by the variables evaluated and is a fair pass/fail assessment of a student's performance. Mean equating is a valuable tool in minimizing the possibly unfair impact of variables on pass/fail decisions for homogeneous student populations.