Cumbo Nicole, Williams Whitney, Canterino Joseph C, Aikman Noelle, Baum Jonathan D
Obstetrics and Gynecology, Hackensack Meridian Jersey Shore University Medical Center, Neptune, USA.
Obstetrics and Gynecology, St George's University School of Medicine, St George's, GRD.
Cureus. 2025 Jul 29;17(7):e88969. doi: 10.7759/cureus.88969. eCollection 2025 Jul.
To evaluate the ability of widely used artificial intelligence (AI) detectors to identify the use of ChatGPT (OpenAI, San Francisco, CA, USA) in personal statements submitted as part of residency program applications.
This qualitative analysis was performed to evaluate the ability of three different AI detectors to detect the use of AI in personal statements submitted as part of residency applications for obstetrics and gynecology. A total of 25 writings were selected and analyzed by GPTZero (Princeton, NJ, USA), Undetectable AI (Sheridan, WY, USA), and Winston AI (Montreal, Quebec, Canada).
In total, 25 separate writing samples of approximately 700 words each were entered into the three AI detectors. AI-generated works had high rates of AI detection, while classic literature samples had low rates of detection. Results for human-written personal statements composed before and after the availability of ChatGPT were mixed, with 64-100% and 3-100% of content flagged as AI-generated, respectively.
AI chatbots have been shown to produce writing that may be indistinguishable from human work and may already be commonly used to create personal statements. It is unclear who is utilizing ChatGPT in their writing, and residency programs everywhere will seek a reliable way to detect unethical usage. This study shows that available AI detectors may be able to detect AI use in applicants' personal statements, but the use of unvalidated tools may harm honest applicants.
Residency programs may be able to detect AI use in personal statements by utilizing AI-detection tools. Clear guidelines regarding the appropriate use of AI and authorship must be developed in order to maintain the integrity of student submissions.