Centre Académique de Médecine Générale, Université catholique de Louvain, Avenue Emmanuel Mounier 53 (boîte 5360), 1200, Brussels, Belgium.
Centre de Pédagogie Appliquée aux Sciences de la Santé, Université de Montréal, Montreal, Canada.
Adv Health Sci Educ Theory Pract. 2010 Mar;15(1):55-63. doi: 10.1007/s10459-009-9169-z. Epub 2009 Jun 4.
Case-specificity, i.e., variability of a subject's performance across cases, has been a consistent finding in medical education. It has important implications for assessment validity and reliability, yet its root causes remain a matter of discussion. One hypothesis, content-specificity, attributes variability of performance to variable levels of relevant knowledge. Extended-matching items (EMIs) are an ideal format for testing this hypothesis because items are grouped by topic. If differences in content knowledge are the main cause of case-specificity, variability across topics should be high and variability across items within the same topic low. We applied generalisability analysis to the results of a written test of 159 EMIs sat by two cohorts of general practice trainees at one university. Two hundred and twenty-seven trainees took part. The variance component attributed to subjects was small. Variance attributed to topics was smaller than variance attributed to items. The main source of error was the interaction between subjects and items, accounting for two-thirds of the error variance. The generalisability D study revealed that, for the same total number of items, increasing the number of topics yields a higher G coefficient than increasing the number of items per topic. Topical knowledge therefore does not seem to explain the case-specificity observed in our data. Knowledge structure and reasoning strategy may be more important, in particular the pattern recognition that EMIs were designed to elicit. Causal explanations of case-specificity may thus depend on test format. Increasing the number of topics, with fewer items each, would increase reliability but also testing time.
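The D-study trade-off described above can be sketched numerically. For a persons × (items nested in topics) random-effects design, the relative G coefficient is σ²_p / (σ²_p + σ²_pt/n_t + σ²_pi:t/(n_t·n_i)); holding the total item count n_t·n_i fixed, only the subject-by-topic term shrinks as topics are added. The variance components below are illustrative assumptions chosen to echo the pattern the abstract reports (small subject variance, subject-by-item interaction dominating the error), not the study's actual estimates.

```python
# Minimal D-study sketch for a p x (i:t) design; variance components are
# hypothetical, chosen only to mirror the qualitative pattern in the abstract.

def g_coefficient(var_p, var_pt, var_pi, n_topics, n_items_per_topic):
    """Relative G coefficient: subject variance over subject variance plus
    relative error, where the p x t term averages over topics and the
    p x (i:t) term averages over all items."""
    rel_error = var_pt / n_topics + var_pi / (n_topics * n_items_per_topic)
    return var_p / (var_p + rel_error)

# Assumed components: small subject variance, large subject-by-item interaction.
var_p, var_pt, var_pi = 0.02, 0.03, 0.15

# Same total of 60 items, allocated two ways:
g_many_topics = g_coefficient(var_p, var_pt, var_pi, n_topics=30, n_items_per_topic=2)
g_few_topics = g_coefficient(var_p, var_pt, var_pi, n_topics=10, n_items_per_topic=6)

print(round(g_many_topics, 3))  # 30 topics x 2 items
print(round(g_few_topics, 3))   # 10 topics x 6 items
```

With the total item count fixed, the p × (i:t) error term is identical in both allocations, so the G coefficient is higher whenever the topic count is larger, which is the paper's practical conclusion.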