Inflated Clinical Evaluations: A Comparison of Faculty-Selected and Mathematically Calculated Overall Evaluations Based on Behaviorally Anchored Assessment Data.

Author information

Meyer Eric G, Cozza Kelly L, Konara Riley M R, Hamaoka Derrick, West James C

Affiliations

Uniformed Services University of the Health Sciences, Bethesda, MD, USA.

Reynolds Army Community Hospital, Fort Sill, OK, USA.

Publication information

Acad Psychiatry. 2019 Apr;43(2):151-156. doi: 10.1007/s40596-018-0957-8. Epub 2018 Aug 8.

Abstract

OBJECTIVE

This retrospective study compared faculty-selected evaluation scores with those mathematically calculated from behaviorally anchored assessments.

METHODS

Data from 1036 psychiatry clerkship clinical evaluations (2012-2015) were reviewed. These clinical evaluations required faculty to assess clinical performance using 14 behaviorally anchored questions followed by a faculty-selected overall evaluation. An explicit rubric was included in the overall evaluation to assist the faculty in interpreting their 14 assessment responses. Using the same rubric, mathematically calculated evaluations of the same assessment responses were generated and compared to the faculty-selected evaluations.
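The abstract does not reproduce the study's rubric, so as a minimal sketch, a hypothetical rubric that maps the mean of the 14 behaviorally anchored item responses onto the 1-5 overall scale (where 1 = outstanding and 3 = satisfactory, so lower is better) might look like:

```python
def calculated_overall(item_scores, cutoffs=(1.5, 2.5, 3.5, 4.5)):
    """Hypothetical rubric (not the paper's): map the mean of the 14
    behaviorally anchored item responses (1 = best, 5 = worst) onto
    the 1-5 overall evaluation scale by rounding against cutoffs."""
    mean_score = sum(item_scores) / len(item_scores)
    for grade, cutoff in enumerate(cutoffs, start=1):
        if mean_score < cutoff:
            return grade
    return 5

# Fourteen uniform "2" responses yield an overall grade of 2.
print(calculated_overall([2] * 14))
```

Any rule of this shape lets the same assessment responses be scored deterministically, which is what makes the comparison with the faculty-selected overall evaluation possible.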

RESULTS

Comparison of faculty-selected to mathematically calculated evaluations revealed that while the two methods were reliably correlated (Cohen's kappa = 0.314, Pearson's coefficient = 0.658, p < 0.001), there was a notable difference in the results (t = 24.5, p < 0.0001). The average faculty-selected evaluation was 1.58 (SD = 0.61) with a mode of "1" or "outstanding," while the mathematically calculated evaluation had an average of 2.10 (SD = 0.90) with a mode of "3" or "satisfactory." 51.0% of the faculty-selected evaluations matched the mathematically calculated results: 46.1% were higher and 2.9% were lower.
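The match/higher/lower breakdown above can be tallied directly from paired grades. A minimal sketch (with toy data, not the study's) follows; note the scale inversion: on a 1-5 scale where 1 = outstanding, a "higher" evaluation is a numerically lower grade.

```python
def agreement_summary(faculty, calculated):
    """Tally exact matches and the direction of disagreement between
    paired overall grades. On this 1-5 scale a numerically LOWER grade
    is a HIGHER (more favorable) evaluation."""
    n = len(faculty)
    matched = sum(f == c for f, c in zip(faculty, calculated))
    faculty_higher = sum(f < c for f, c in zip(faculty, calculated))
    faculty_lower = sum(f > c for f, c in zip(faculty, calculated))
    return {
        "matched": matched / n,
        "faculty_higher": faculty_higher / n,
        "faculty_lower": faculty_lower / n,
    }

# Toy example: faculty choose a more favorable grade than the rubric
# in one of four cases, a less favorable one in another.
print(agreement_summary([1, 1, 2, 3], [1, 2, 2, 2]))
```

Applied to the study's 1036 evaluation pairs, this kind of tally produced the reported 51.0% matched / 46.1% faculty-higher / 2.9% faculty-lower split.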

CONCLUSIONS

Clerkship clinical evaluation forms that require faculty to make an overall evaluation generate results that are significantly higher than those that would have been assigned using the behaviorally anchored assessment questions alone. Focusing faculty attention on assessing specific behaviors rather than on overall evaluations may reduce this inflation and improve validity. Clerkships may want to consider removing overall evaluation questions from their clinical evaluation tools.
