Centre for Educational Measurement at the University of Oslo (CEMO) and Centre for Health Sciences Education, University of Oslo, Oslo, Norway.
Department of Psychology, Humboldt-Universität zu Berlin, Berlin, Germany.
Adv Health Sci Educ Theory Pract. 2018 Mar;23(1):217-232. doi: 10.1007/s10459-017-9771-4. Epub 2017 Mar 16.
Despite the frequent use of state-of-the-art psychometric models in the field of medical education, there is a growing body of literature that questions their usefulness in the assessment of medical competence. Essentially, a number of authors raised doubt about the appropriateness of psychometric models as a guiding framework to secure and refine current approaches to the assessment of medical competence. In addition, an intriguing phenomenon known as case specificity is specific to the controversy on the use of psychometric models for the assessment of medical competence. Broadly speaking, case specificity is the finding of instability of performances across clinical cases, tasks, or problems. As stability of performances is, generally speaking, a central assumption in psychometric models, case specificity may limit their applicability. This has probably fueled critiques of the field of psychometrics with a substantial amount of potential empirical evidence. This article aimed to explain the fundamental ideas employed in psychometric theory, and how they might be problematic in the context of assessing medical competence. We further aimed to show why and how some critiques do not hold for the field of psychometrics as a whole, but rather only for specific psychometric approaches. Hence, we highlight approaches that, from our perspective, seem to offer promising possibilities when applied in the assessment of medical competence. In conclusion, we advocate for a more differentiated view on psychometric models and their usage.
尽管在医学教育领域频繁使用最先进的心理计量学模型,但越来越多的文献对其在评估医学能力方面的有用性提出了质疑。本质上,一些作者对心理计量学模型作为指导框架来确保和完善当前评估医学能力的方法的适当性表示怀疑。此外,一种被称为案例特异性的有趣现象是与使用心理计量学模型评估医学能力的争议有关的特异性。广义地说,案例特异性是指在临床案例、任务或问题中表现的不稳定性。由于性能的稳定性通常是心理计量学模型的一个核心假设,因此案例特异性可能会限制其适用性。这可能助长了对心理计量学领域的批评,因为有大量潜在的实证证据。本文旨在解释心理计量理论中使用的基本思想,以及它们在评估医学能力方面可能存在的问题。我们进一步旨在展示为什么以及如何对于整个心理计量学领域来说,有些批评并不成立,而只是针对特定的心理计量学方法。因此,我们强调了从我们的角度来看,在评估医学能力时似乎具有有前途的可能性的方法。总之,我们主张对心理计量模型及其使用采取更具差异化的观点。