Global clinical performance rating, reliability and validity in an undergraduate clerkship.

Authors

Daelmans H E M, van der Hem-Stokroos H H, Hoogenboom R J I, Scherpbier A J J A, Stehouwer C D A, van der Vleuten C P M

Affiliation

Department of Skills Training, Vrije Universiteit Medical Centre, Amsterdam, Netherlands.

Publication details

Neth J Med. 2005 Jul-Aug;63(7):279-84.

PMID: 16093582

Abstract

BACKGROUND

Global performance rating is frequently used in clinical training despite its known psychometric drawbacks. Inter-rater reliability is low in undergraduate training but better in residency training, possibly because residency offers more opportunities for supervision. The low or moderate predictive validity of global performance ratings in undergraduate and residency training may be due to low or unknown reliability of both global performance ratings and criterion measures. In an undergraduate clerkship, we investigated whether reliability improves when raters are more familiar with students' work and whether validity improves with increased reliability of the predictor and criterion instrument.

METHODS

Inter-rater reliability was determined in a clerkship with more student-rater contacts than usual. The in-training assessment programme of the clerkship that immediately followed was used as the criterion measure to determine predictive validity.

RESULTS

With four ratings, inter-rater reliability was 0.41 and predictive validity was 0.32. Reliability was lower and validity slightly higher than similar results published for residency training.
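As a rough illustration of what a four-rating reliability of 0.41 implies, the sketch below applies the Spearman-Brown prophecy formula. This is an assumption for illustration only: the abstract does not say how the coefficient was computed, and the function names are hypothetical. It treats 0.41 as the reliability of the mean of four ratings and treats the ratings as parallel measurements.

```python
import math


def spearman_brown(r_single: float, k: float) -> float:
    """Reliability of the mean of k parallel ratings, given single-rating reliability."""
    return k * r_single / (1 + (k - 1) * r_single)


def single_rating_reliability(r_k: float, k: float) -> float:
    """Invert Spearman-Brown: single-rating reliability implied by a k-rating value."""
    return r_k / (k - (k - 1) * r_k)


def ratings_needed(r_k: float, k: float, target: float) -> int:
    """Smallest whole number of ratings whose mean reaches the target reliability."""
    r1 = single_rating_reliability(r_k, k)
    return math.ceil(target * (1 - r1) / (r1 * (1 - target)))


if __name__ == "__main__":
    # Values from the abstract: reliability 0.41 with four ratings.
    # Assumption: 0.41 refers to the mean of the four ratings.
    r1 = single_rating_reliability(0.41, 4)
    print(f"Implied single-rating reliability: {r1:.3f}")  # about 0.148

    # The 0.80 target is a commonly quoted threshold, not a figure from the paper.
    print(f"Ratings needed for 0.80: {ratings_needed(0.41, 4, 0.80)}")  # 24
```

Under these assumptions, a single rating would have a reliability of about 0.15, and roughly 24 ratings would be needed to reach 0.80. This is consistent with the authors' conclusion that global ratings alone are too unreliable for individual assessment.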

CONCLUSION

Even with increased student-rater interaction, the reliability and validity of global performance ratings were too low to warrant the usage of global performance ratings as individual assessment format. However, combined with other assessment measures, global performance ratings may lead to improved integral assessment.

Similar articles

1
Global clinical performance rating, reliability and validity in an undergraduate clerkship.
Neth J Med. 2005 Jul-Aug;63(7):279-84.
2
Limitations of physician ratings in the assessment of student clinical performance in an obstetrics and gynecology clerkship.
Obstet Gynecol. 1991 Jul;78(1):136-41.
3
Written case reports as assessment of the elective student clerkship: consistency of central grading and comparison with ratings of clinical performance.
Med Teach. 2004 Jun;26(4):301-4. doi: 10.1080/01421590410001683212.
4
Factors in faculty evaluation of medical students' performance.
Med Educ. 2007 Jul;41(7):667-75. doi: 10.1111/j.1365-2923.2007.02787.x.
5
Assessment of neonatal resuscitation skills: a reliable and valid scoring system.
Resuscitation. 2006 Nov;71(2):212-21. doi: 10.1016/j.resuscitation.2006.04.009. Epub 2006 Sep 20.
6
Exploring students' perceptions on the use of significant event analysis, as part of a portfolio assessment process in general practice, as a tool for learning how to use reflection in learning.
BMC Med Educ. 2007 Mar 30;7:5. doi: 10.1186/1472-6920-7-5.
7
Differences in inter-rater reliability and accuracy for a treatment adherence scale.
Cogn Behav Ther. 2007;36(4):230-9. doi: 10.1080/16506070701584367.
8
Communication assessment using the common ground instrument: psychometric properties.
Fam Med. 2004 Mar;36(3):189-98.
9
A comparison of global rating scale and checklist scores in the validation of an evaluation tool to assess performance in the resuscitation of critically ill patients during simulated emergencies (abbreviated as "CRM simulator study IB").
Simul Healthc. 2009 Spring;4(1):6-16. doi: 10.1097/SIH.0b013e3181880472.
10
Increasing inter-rater agreement on a family medicine clerkship oral examination--a pilot study.
Fam Med. 1993 Mar;25(3):182-5.

Cited by

1
Broadly sampled assessment reduces ethnicity-related differences in clinical grades.
Med Educ. 2019 Mar;53(3):264-275. doi: 10.1111/medu.13790. Epub 2019 Jan 25.
2
Development of an Objective Structured Clinical Examination for Assessment of Clinical Skills in an Emergency Medicine Clerkship.
West J Emerg Med. 2015 Nov;16(6):866-70. doi: 10.5811/westjem.2015.9.27307. Epub 2015 Oct 22.
3
Feasibility of an internet-based global ranking instrument.
J Grad Med Educ. 2011 Mar;3(1):67-74. doi: 10.4300/JGME-D-10-00162.1.