Cincinnati Research in Outcomes and Safety in Surgery (CROSS) Research Group, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH.
Surgery. 2024 Dec;176(6):1610-1616. doi: 10.1016/j.surg.2024.08.018. Epub 2024 Sep 19.
Use of artificial intelligence to generate personal statements for residency is currently not permitted but is difficult to monitor. This study sought to evaluate the ability of surgical residency application reviewers to identify artificial intelligence-generated personal statements and to understand perceptions of this practice.
Three personal statements were generated using ChatGPT, and 3 were written by medical students who previously matched into surgery residency. Blinded participants at a single institution were instructed to read all personal statements and identify which were generated by artificial intelligence; they then completed a survey exploring their opinions regarding artificial intelligence use.
Of the 30 participants, 50% were faculty (n = 15) and 50% were residents (n = 15). Overall, experience ranged from 0 to 20 years (median, 2 years; interquartile range, 1-6.25 years). Artificial intelligence-generated personal statements were identified correctly only 59% of the time, with 3 participants (10%) identifying all of them correctly. Artificial intelligence-generated personal statements were labeled as the best 60% of the time and the worst 43.3% of the time. When asked whether artificial intelligence use should be allowed in personal statement writing, 66.7% (n = 20) said no and 30% (n = 9) said yes. When asked whether the use of artificial intelligence would affect their opinion of an applicant, 80% (n = 24) said yes and 20% (n = 6) said no. When survey responses and the ability to identify artificial intelligence-generated personal statements were evaluated by faculty/resident status and years of experience, no differences were noted (P > .05).
This study shows that surgical faculty and residents cannot reliably identify artificial intelligence-generated personal statements and that concerns exist regarding the impact of artificial intelligence on the application process.