Adjusting for Resident Rater Leniency or Severity Improves the Reliability of Routine Resident Evaluations of Faculty Anesthesiologists.

Authors

Dexter Franklin, Vasilopoulos Terrie, Fahy Brenda G

Affiliations

Anesthesia, University of Iowa, Iowa City, USA.

Anesthesiology/Orthopedics and Rehabilitation, University of Florida College of Medicine, Gainesville, USA.

Publication

Cureus. 2025 Jun 19;17(6):e86366. doi: 10.7759/cureus.86366. eCollection 2025 Jun.

DOI:10.7759/cureus.86366
PMID:40688984
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12276026/
Abstract

Background: The Accreditation Council for Graduate Medical Education (ACGME) of the United States requires all programs to evaluate faculty performance annually. Multiple universities require all faculty to be reviewed annually. These high-stakes evaluations should be reliable. When one anesthesiologist is said to perform better than another, there should be neither frequent Type I errors (i.e., an anesthesiologist is determined to perform better or worse than average when their performance is average) nor frequent Type II errors (i.e., failure to detect above- or below-average performance). We investigated the generalizability of the finding that, if adjustment is not made for rater leniency/severity, results will be statistically unreliable.

Methods: University of Florida 11-item evaluations were sent on Mondays over the 2018-19 academic year. The 108 ratees (anesthesiologists) received 3302 evaluations from 85 raters (resident physicians). Replicability was assessed by comparison with previously published findings from the University of Toronto and the University of Iowa.

Results: As observed at the University of Toronto, there was greater heterogeneity of scores among raters than among ratees (raters' eta-squared 0.40; ratees' 0.22). As observed at the University of Iowa, the Florida rater leniency/severity of scores could not validly be modeled based on a normal distribution, because the raters' mean scores were not normally distributed (Shapiro-Wilk W = 0.90 (P = 0.00002) among the 75 raters with ≥9 evaluations). Likewise, matching Iowa, the ratees' mean scores at Florida were not normally distributed (W = 0.91 (P = 0.00001) among the 94 ratees with ≥9 evaluations). In contrast, when evaluations with all items scored at the maximum were treated as having a value of 1, and all others 0, Florida's corresponding distributions of logits were normally distributed, as at Iowa (W = 0.99 (P = 0.90) among raters and W = 0.98 (P = 0.09) among ratees). Rater leniency/severity remained large in the logit scale, with an intraclass correlation coefficient of 0.55. In the original scale, 0/108 ratees had performance that differed significantly from the grand mean of 4.63, using a P < 0.01 criterion. The alternative analysis approach adjusted for the raters' leniency/severity: seven ratees were significantly below average (P ≤ 0.0048) and 17 were above average (P ≤ 0.0086). Because its statistical assumptions were satisfied, analysis in the original scale had a 22% (24/108) false negative rate, like the 21% observed previously at the University of Iowa.

Conclusions: Routine evaluations of faculty anesthesiologist ratees by anesthesiology resident raters give statistically unreliable results, falsely categorizing performance, unless analyses are adjusted for the covariates of raters. The need for adjustment found with the University of Florida data matches the need for this type of adjustment found at the University of Iowa and the University of Toronto. Thus, the need to adjust for raters' leniency/severity appears to be a general finding for routine rater/ratee evaluations.
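The binary recoding described in the Results can be illustrated with a minimal sketch. It is not the authors' analysis (the abstract does not give their model). It only shows the recoding step: an evaluation counts as 1 when every item is at the maximum, otherwise 0, followed by a per-rater empirical logit. The maximum item score of 5, the 0.5 continuity correction, the function names, and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict

MAX_SCORE = 5  # assumed top of the rating scale; not stated in the abstract


def is_all_max(item_scores, max_score=MAX_SCORE):
    """Return 1 if every item received the maximum score, else 0."""
    return int(all(score == max_score for score in item_scores))


def empirical_logits(evaluations, max_score=MAX_SCORE):
    """Map each rater to the empirical logit of their all-maximum proportion.

    `evaluations` is an iterable of (rater_id, item_scores) pairs.
    The 0.5 continuity correction is an assumption of this sketch; it keeps
    the logit finite when a rater always or never gives all-maximum scores.
    """
    successes = defaultdict(int)
    totals = defaultdict(int)
    for rater_id, item_scores in evaluations:
        successes[rater_id] += is_all_max(item_scores, max_score)
        totals[rater_id] += 1
    return {
        rater_id: math.log(
            (successes[rater_id] + 0.5)
            / (totals[rater_id] - successes[rater_id] + 0.5)
        )
        for rater_id in totals
    }


# Hypothetical toy data: two raters, three 11-item evaluations each.
evaluations = [
    ("lenient", [5] * 11),
    ("lenient", [5] * 11),
    ("lenient", [5] * 10 + [4]),
    ("severe", [4] * 11),
    ("severe", [5] * 11),
    ("severe", [3] * 11),
]
logits = empirical_logits(evaluations)
```

A positive logit marks a rater who gives all-maximum evaluations more often than not, and a negative logit marks a more severe rater. Adjusting ratee comparisons for these rater effects would need a model fitted to all rater/ratee pairs, which this sketch does not attempt.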

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/746c/12276026/cb3888ff3859/cureus-0017-00000086366-i01.jpg