Johnson D, Cujec B
Department of Critical Care, Royal University Hospital, Saskatoon, SK, Canada.
Crit Care Med. 1998 Nov;26(11):1811-6. doi: 10.1097/00003246-199811000-00020.
Compare resident evaluations by self, nurses, and attending physicians.
Prospective cohort.
University intensive care unit.
Sixty residents.
End-of-rotation evaluation using a standardized multiple-choice examination and one of two subjective instruments: the Global Rating Scale or the Behaviorally Anchored Rating Scale.
Mean ratings for overall competence on both the Behaviorally Anchored Rating Scale and the Global Rating Scale clustered between 3 and 4 on a 5-point scale. Physicians' evaluations correlated with the multiple-choice test scores (Spearman's rho 0.3082, p = .005, n = 82), whereas neither self-evaluation (Spearman's rho 0.1124, p = .65, n = 42) nor nurses' evaluations (Spearman's rho 0.2060, p = .069, n = 79) correlated significantly with test scores. With the Global Rating Scale, Spearman's correlations were not significant for either overall competence or specific medical knowledge for any category of evaluator. For each criterion of the Behaviorally Anchored Rating Scale, Spearman's rho and the kappa statistic computed between the three types of evaluators (physicians, nurses, and self) showed significant correlations between the ratings of physicians and nurses, except for the assessment of humanistic qualities. By analysis of variance, self-ratings differed systematically from nurse and physician ratings for pooled clinical skills-history taking (b = 0.277, p < .009), humanistic qualities (b = 0.607, p < .001), and professional attitudes and behavior (b = 0.488, p < .001). The model of ratings (independent variables: year of residency, category of evaluator, evaluation criteria, and interaction terms) explained 47.3% of the variance (adjusted r²).
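The rank correlations reported above can be illustrated with a minimal Python sketch. It assumes Spearman's rho is computed as the Pearson correlation of average ranks (the usual treatment of ties); the abstract does not state which software or tie handling the authors used. The function names and the data are hypothetical and are not taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example: 5-point ratings and test scores for six residents.
ratings = [3, 4, 3, 5, 2, 4]
test_scores = [62, 71, 70, 88, 55, 67]
rho = spearman_rho(ratings, test_scores)
print(f"Spearman's rho = {rho:.4f}")
```

Because ratings on a 5-point scale produce many ties, the sketch uses the rank-then-Pearson form rather than the shortcut formula based on squared rank differences, which is exact only when there are no ties.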
Residents' self-ratings did not correlate with multiple-choice test scores and differed on some criteria from physicians' and nurses' evaluations. We found many similarities and some differences between physicians' and nurses' evaluations of residents. We speculate that different categories of evaluators assess different aspects of performance. Assessment by a varied group of evaluators should be used when attempting to predict future practice.