National Board of Medical Examiners, 3750 Market Street, Philadelphia, PA 19104, USA.
Acad Med. 2009 Oct;84(10 Suppl):S83-5. doi: 10.1097/ACM.0b013e3181b37d01.
Previous research has shown that ratings of English proficiency on the United States Medical Licensing Examination Clinical Skills Examination are highly reliable. However, the score distributions for native and nonnative speakers of English are sufficiently different to suggest that reliability should be investigated separately for each group.
Generalizability theory was used to obtain reliability indices separately for native and nonnative speakers of English (N = 29,084). Conditional standard errors of measurement were also obtained for both groups to evaluate measurement precision for each group at specific score levels.
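The abstract does not state the measurement design, so the following is only a minimal sketch of how such indices can be computed. It assumes a one-facet crossed design (examinees by encounters) with complete data. The function names `g_study`, `phi`, and `conditional_sem` are hypothetical and the toy scores are invented for illustration; none of this is the authors' actual analysis.

```python
from math import sqrt

def g_study(scores):
    """Estimate variance components for a crossed persons x items design.

    `scores` is a list of rows, one per person, each with one rating per item.
    Assumption: complete data and a single facet (not the study's real design).
    """
    n_p, n_i = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_p * n_i)
    p_means = [sum(row) / n_i for row in scores]
    i_means = [sum(row[j] for row in scores) / n_p for j in range(n_i)]

    # Sums of squares for persons, items, and the residual (interaction + error).
    ss_p = n_i * sum((m - grand) ** 2 for m in p_means)
    ss_i = n_p * sum((m - grand) ** 2 for m in i_means)
    ss_tot = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_tot - ss_p - ss_i

    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    # Expected-mean-square solutions; negative estimates are truncated at zero.
    var_p = max((ms_p - ms_res) / n_i, 0.0)
    var_i = max((ms_i - ms_res) / n_p, 0.0)
    return var_p, var_i, ms_res

def phi(var_p, var_i, var_res, n_i):
    """Index of dependability: person variance over person plus absolute error variance."""
    abs_error = (var_i + var_res) / n_i
    return var_p / (var_p + abs_error)

def conditional_sem(row):
    """Conditional absolute SEM for one person: standard error of that person's mean."""
    n = len(row)
    mean = sum(row) / n
    return sqrt(sum((x - mean) ** 2 for x in row) / (n * (n - 1)))

# Toy ratings (invented): 3 examinees rated on 3 encounters.
toy = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
var_p, var_i, var_res = g_study(toy)
overall_phi = phi(var_p, var_i, var_res, n_i=3)
per_person_sem = [conditional_sem(row) for row in toy]
```

Running the same functions separately on the native and nonnative subsets would give group-specific phi values. Averaging `conditional_sem` within bands of observed scores would show how precision varies across the score distribution.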
Overall indices of reliability (phi) exceeded 0.90 for both native and nonnative speakers, and both groups were measured with nearly equal precision throughout the score distribution. However, measurement precision decreased at lower levels of proficiency for all examinees.
The results of this and future studies may be helpful in understanding and minimizing sources of measurement error at particular regions of the score distribution.