Cranton P A, Dauphinee W D, McQueen M M, Smith L P
Res Med Educ. 1984;23:59-64.
In a specific Obstetrics and Gynecology program, the Program and Certifying ITERs were evaluated for their measurement qualities. The internal consistency of the ITERs is supported. The tendency toward high inter-item correlations suggests that an overall judgment of candidates may be influencing individual item rankings, particularly on the Program ITER. Faculty unfamiliarity with the appropriate behaviors, evidenced by their inability to select the correct behavior for each item, may be one reason for this effect. Inter-form consistency is very limited, and randomly paired items often correlate more highly than parallel items. The stability of the Program ITER is supported, but there is little support for criterion validity based on the criterion variables available. It is concluded that more clearly defined behaviors must be identified for each ITER item and that faculty must be trained in their use. The use of the same ITER for all specialties may be a major reason for this inconsistency. Lastly, more studies of validity are advised.