Brigham and Women's Hospital, Boston, MA 02115, USA.
Acad Med. 2012 Aug;87(8):1070-6. doi: 10.1097/ACM.0b013e31825d0a2a.
Despite standardized curricula and mandated accreditation, concern exists regarding the variability and imprecision of medical student evaluation. The authors set out to perform a complete review of clerkship evaluation in U.S. medical schools.
Clerkship evaluation data were obtained from all Association of American Medical Colleges-affiliated medical schools reporting enrollment during 2009-2010. Deidentified reports were analyzed to define the grading system and the percentage of each class within each grading tier. Inter- and intraschool grading variation was assessed in part by comparing the proportion of students receiving the top grade.
Data were analyzed from 119 of 123 accredited medical schools. Dramatic variation was detected: the authors documented eight different grading systems using 27 unique sets of descriptive terminology. Imprecision of grading was apparent, with institutions frequently using the same wording (e.g., "honors") to imply different meanings. The percentage of students awarded the top grade in any clerkship varied widely from school to school (range 2%-93%), as well as from clerkship to clerkship within the same school (range 18%-81%). Ninety-seven percent of all U.S. clerkship students were awarded one of the top three grades regardless of the number of grading tiers. Nationally, fewer than 1% of students failed any required clerkship.
Grading systems are highly heterogeneous, and grade meaning is imprecise, throughout the U.S. medical education system. Systematic changes that increase the consistency, transparency, and reliability of grade meaning are needed to improve student evaluation at the national level.