Gordon Emile B, Maxfield Charles M, French Robert, Fish Laura J, Romm Jacob, Barre Emily, Kinne Erica, Peterson Ryan, Grimm Lars J
Department of Radiology, Duke University Health System, Durham, North Carolina; Department of Radiology, University of California San Diego, La Jolla, California.
Department of Radiology, Duke University Health System, Durham, North Carolina.
J Am Coll Radiol. 2025 Jan;22(1):33-40. doi: 10.1016/j.jacr.2024.08.027. Epub 2024 Sep 17.
This study explores radiology residency program directors' perspectives on the impact of applicants' use of large language models (LLMs) to craft personal statements.
Eight program directors from the Radiology Residency Education Research Alliance participated in a mixed-methods study that included a survey of impressions of artificial intelligence (AI)-generated personal statements and focus group discussions (July 2023). Each director reviewed four personal statement variations for each of five applicants, anonymized to author type: the original and three versions generated by Chat Generative Pre-trained Transformer-4.0 (GPT) with varying prompts; ratings were aggregated for analysis. A 5-point Likert scale assessed writing quality (voice, clarity, engagement, and organization) and the perceived origin of each statement. An experienced qualitative researcher facilitated the focus group discussions. Data were analyzed using a rapid analytic approach with a coding template capturing key areas related to residency applications.
Ratings of GPT-generated statements were more often average or worse in quality (56%, 268 of 475) than ratings of human-authored statements (29%, 45 of 160). Although reviewers were not confident in their ability to distinguish the origin of personal statements, they did so reliably and consistently, identifying 95% (38 of 40) of human-authored personal statements as probably or definitely original. Focus group discussions highlighted the inevitable use of AI in crafting personal statements and concerns about its impact on the authenticity and value of the personal statement in residency selection. Program directors were divided on the appropriate use and regulation of AI.
Radiology residency program directors rated LLM-generated personal statements as lower in quality and expressed concern about the loss of the applicant's voice, but they acknowledged the inevitability of increased AI use in the generation of application statements.