Suppr超能文献

A Comparison of Machine-Graded (ChatGPT) and Human-Graded Essay Scores in Veterinary Admissions.

作者信息

Vanderstichel Raphael, Stryhn Henrik

机构信息

Department of Veterinary Clinical Sciences, College of Veterinary Medicine, Long Island University, Brookville, NY, 11548, USA.

Atlantic Veterinary College, University of Prince Edward Island, Charlottetown, PE, C1A 4P3, Canada.

出版信息

J Vet Med Educ. 2025 Jun;52(3):372-380. doi: 10.3138/jvme-2023-0162. Epub 2024 May 22.

Abstract

Admissions committees have historically emphasized cognitive measures, but a paradigm shift toward holistic reviews now places greater importance on non-cognitive skills. These holistic reviews may include personal statements, experiences, references, interviews, multiple mini-interviews, and situational judgment tests, often requiring substantial faculty resources. Leveraging advances in artificial intelligence, particularly in natural language processing, this study was conducted to assess the agreement of essay scores graded by both humans and machines (OpenAI's ChatGPT). Correlations were calculated among these scores and cognitive and non-cognitive measures in the admissions process. Human-derived scores from 778 applicants in 2021 and 552 in 2022 had item-specific inter-rater reliabilities ranging from 0.07 to 0.41, while machine-derived inter-replicate reliabilities ranged from 0.41 to 0.61. Pairwise correlations between human- and machine-derived essay scores and other admissions criteria revealed moderate correlations between the two scoring methods (0.41) and fair correlations between the essays and the multiple mini-interview (0.20 and 0.22 for human and machine scores, respectively). Despite having very low correlations, machine-graded scores exhibited slightly stronger correlations with cognitive measures (0.10 to 0.15) compared to human-graded scores (0.01 to 0.02). Importantly, machine scores demonstrated higher precision, approximately two to three times greater than human scores in both years. This study emphasizes the importance of careful item design, rubric development, and prompt formulation when using machine-based essay grading. It also underscores the importance of employing replicates and robust statistical analyses to ensure equitable applicant ranking when integrating machine grading into the admissions process.

摘要

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验