McGill D A, van der Vleuten C P M, Clarke M J
Department of Cardiology, The Canberra Hospital, Garran, ACT 2605, Australia.
Department of Educational Research and Development, Maastricht University, Maastricht, The Netherlands.
BMC Med Educ. 2015 Dec 30;15:237. doi: 10.1186/s12909-015-0520-1.
Evaluations of clinical assessments that use judgement-based methods have frequently shown them to have sub-optimal reliability and internal validity evidence for their interpretation and intended use. The aim of this study was to enhance that validity evidence by an evaluation of the internal validity and reliability of competency constructs from supervisors' end-of-term summative assessments for prevocational medical trainees.
The populations were medical trainees preparing for full registration as a medical practitioner (n = 74) and supervisors who undertook ≥2 end-of-term summative assessments (n = 349) from a single institution. Confirmatory factor analysis was used to evaluate the internal construct validity of the assessment. The hypothesised competency construct model to be tested, identified by exploratory factor analysis, had a theoretical basis established in the workplace-psychology literature. It was compared with competing models of potential competency constructs, including the competency construct model of the original assessment. The optimal model for the competency constructs was identified using model fit and measurement invariance analysis. Construct homogeneity was assessed by Cronbach's α. Reliability measures were the variance components of individual competency items and of the identified competency constructs, and the number of assessments needed to achieve adequate reliability of R > 0.80.
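As an illustration of the homogeneity measure named above, the following is a minimal sketch of Cronbach's α computed from a ratings matrix. The function name, the data layout (one row per assessment, one column per competency item) and the example ratings are assumptions made for illustration; they are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per assessment).

    Each row holds the ratings for the k items of one construct.
    The layout and the name are illustrative assumptions.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across assessments.
    item_var_sum = sum(variance(col) for col in zip(*scores))
    # Sample variance of the summed score of each assessment (row).
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total score has zero variance")
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical ratings: three items that move together perfectly.
example = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(example)
```

Items that vary together perfectly, as in this made-up example, give the maximum value of α; real ratings give lower values.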
The hypothesised competency constructs of "general professional job performance", "clinical skills" and "professional abilities" provided a good model fit to the data, and a better fit than all alternative models. Model fit indices were χ²/df = 2.8; RMSEA = 0.073 (CI 0.057-0.088); CFI = 0.93; TLI = 0.95; SRMR = 0.039; WRMR = 0.93; AIC = 3879; and BIC = 4018. The optimal model had adequate measurement invariance, with nested analysis of important population subgroups supporting the presence of full metric invariance. Reliability estimates for the competency construct "general professional job performance" indicated a resource-efficient and reliable assessment for that construct (6 assessments for R > 0.80). Item homogeneity was good (Cronbach's α = 0.899). The other competency constructs were resource intensive, requiring ≥11 assessments for a reliable assessment score.
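The link between variance components and the number of assessments needed can be sketched as follows. This assumes a simple one-facet design in which the reliability of the mean of n assessments is R = σ²(trainee) / (σ²(trainee) + σ²(residual) / n). The abstract does not state the exact design or the variance estimates, so the function name and the input values below are hypothetical.

```python
def assessments_needed(var_trainee, var_residual, target=0.80, max_n=1000):
    """Smallest n for which the mean of n assessments has R > target.

    Assumes a one-facet design: R_n = var_trainee / (var_trainee + var_residual / n).
    All names and inputs are illustrative, not values from the study.
    """
    if var_trainee <= 0 or var_residual < 0:
        raise ValueError("need var_trainee > 0 and var_residual >= 0")
    if not 0 < target < 1:
        raise ValueError("target must lie strictly between 0 and 1")
    for n in range(1, max_n + 1):
        reliability = var_trainee / (var_trainee + var_residual / n)
        if reliability > target:
            return n
    raise ValueError("target not reached within max_n assessments")


# Hypothetical variance components (not from the study).
n_required = assessments_needed(var_trainee=0.3, var_residual=0.7)
```

Under this assumption, a construct with a larger share of trainee variance reaches R > 0.80 with fewer assessments, which matches the pattern reported above.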
Internal validity and reliability of clinical competence assessments using judgement-based methods are acceptable when actual competency constructs used by assessors are adequately identified. Validation for interpretation and use of supervisors' assessment in local training schemes is feasible using standard methods for gathering validity evidence.