Digital Ink and Surgical Dreams: Perceptions of Artificial Intelligence-Generated Essays in Residency Applications.

Author affiliations

Department of Biomedical Engineering, University of Rochester, Rochester, New York.

School of Medicine and Dentistry, University of Rochester, Rochester, New York.

Publication information

J Surg Res. 2024 Sep;301:504-511. doi: 10.1016/j.jss.2024.06.020. Epub 2024 Jul 22.

DOI: 10.1016/j.jss.2024.06.020
PMID: 39042979
Abstract

INTRODUCTION

Large language models like Chat Generative Pre-Trained Transformer (ChatGPT) are increasingly used in academic writing. Faculty may consider the use of artificial intelligence (AI)-generated responses a form of cheating. We sought to determine whether general surgery residency faculty could distinguish AI-generated from human-written responses to a text prompt, hypothesizing that faculty would not be able to differentiate them reliably.

METHODS

Ten essays were generated from the text prompt "Tell us in 1-2 paragraphs why you are considering the University of Rochester for General Surgery residency" (current trainees: n = 5, ChatGPT: n = 5). Ten blinded faculty reviewers rated the essays on a ten-point Likert scale across the following criteria: desire to interview, relevance to the general surgery residency, overall impression, and whether the essay was AI- or human-generated. Scores and identification error rates were compared between the groups.

RESULTS

There were no differences between groups for %total points (ChatGPT 66.0 ± 13.5%, human 70.0 ± 23.0%, P = 0.508) or identification error rates (ChatGPT 40.0 ± 35.0%, human 20.0 ± 30.0%, P = 0.175). Except for one, all essays were identified incorrectly by at least two reviewers. Essays identified as human-generated received higher overall impression scores (area under the curve: 0.82 ± 0.04, P < 0.01).

CONCLUSIONS

Whether the use of AI tools for academic purposes should constitute academic dishonesty is controversial. We demonstrate that human- and AI-generated essays are similar in quality, but that there is bias against presumed AI-generated essays. Because faculty are unable to reliably differentiate human from AI-generated essays, this bias may be misdirected. AI tools are becoming ubiquitous and their use is not easily detected. Faculty should expect these tools to play an increasing role in medical education.
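The area-under-the-curve statistic in the results is a rank-based measure: the probability that an essay a reviewer judged human-written receives a higher overall-impression score than one judged AI-generated (ties counting half). A minimal sketch of that computation, using hypothetical reviewer ratings (not the study's actual data):

```python
def auc(pos_scores, neg_scores):
    """Rank-based AUC: probability that a randomly chosen score from
    pos_scores outranks one from neg_scores, with ties counting 0.5."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical overall-impression ratings (ten-point Likert scale) for
# essays a reviewer judged human-written vs. AI-generated.
judged_human = [8, 9, 7, 8, 6]
judged_ai = [5, 6, 7, 4, 6]

print(f"AUC = {auc(judged_human, judged_ai):.2f}")  # → AUC = 0.90
```

This pairwise formulation is equivalent to the Mann-Whitney U statistic divided by the number of pairs; an AUC near 0.5 would indicate no association between perceived authorship and impression score, while the study's reported 0.82 indicates a substantial bias toward essays presumed human-written.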


Similar articles

1
Digital Ink and Surgical Dreams: Perceptions of Artificial Intelligence-Generated Essays in Residency Applications.
J Surg Res. 2024 Sep;301:504-511. doi: 10.1016/j.jss.2024.06.020. Epub 2024 Jul 22.
2
Distinguishing Authentic Voices in the Age of ChatGPT: Comparing AI-Generated and Applicant-Written Personal Statements for Plastic Surgery Residency Application.
Ann Plast Surg. 2023 Sep 1;91(3):324-325. doi: 10.1097/SAP.0000000000003653.
3
Residency Application Selection Committee Discriminatory Ability in Identifying Artificial Intelligence-Generated Personal Statements.
J Surg Educ. 2024 Jun;81(6):780-785. doi: 10.1016/j.jsurg.2024.02.009. Epub 2024 Apr 27.
4
Comparison of Medical Research Abstracts Written by Surgical Trainees and Senior Surgeons or Generated by Large Language Models.
JAMA Netw Open. 2024 Aug 1;7(8):e2425373. doi: 10.1001/jamanetworkopen.2024.25373.
5
A large-scale comparison of human-written versus ChatGPT-generated essays.
Sci Rep. 2023 Oct 30;13(1):18617. doi: 10.1038/s41598-023-45644-9.
6
Human vs machine: identifying ChatGPT-generated abstracts in Gynecology and Urogynecology.
Am J Obstet Gynecol. 2024 Aug;231(2):276.e1-276.e10. doi: 10.1016/j.ajog.2024.04.045. Epub 2024 May 6.
7
Reviewer Experience Detecting and Judging Human Versus Artificial Intelligence Content: The Journal Essay Contest.
Stroke. 2024 Oct;55(10):2573-2578. doi: 10.1161/STROKEAHA.124.045012. Epub 2024 Sep 3.
8
Assessing the Reproducibility of the Structured Abstracts Generated by ChatGPT and Bard Compared to Human-Written Abstracts in the Field of Spine Surgery: Comparative Analysis.
J Med Internet Res. 2024 Jun 26;26:e52001. doi: 10.2196/52001.
9
AI language models in human reproduction research: exploring ChatGPT's potential to assist academic writing.
Hum Reprod. 2023 Dec 4;38(12):2281-2288. doi: 10.1093/humrep/dead207.
10
The Revival of Essay-Type Questions in Medical Education: Harnessing Artificial Intelligence and Machine Learning.
J Coll Physicians Surg Pak. 2024 May;34(5):595-599. doi: 10.29271/jcpsp.2024.05.595.

Cited by

1
Artificial Intelligence vs Human Authorship in Spine Surgery Fellowship Personal Statements: Can ChatGPT Outperform Applicants?
Global Spine J. 2025 May 20:21925682251344248. doi: 10.1177/21925682251344248.
2
Comment on "Artificial Intelligence-Generated Writing in the ERAS Personal Statement: An Emerging Quandary for Post-Graduate Medical Education".
Acad Psychiatry. 2025 Apr;49(2):200-201. doi: 10.1007/s40596-025-02123-9. Epub 2025 Feb 18.
3
Artificial Intelligence in Medical Writing: Addressing Untouched Threats.
JMA J. 2025 Jan 15;8(1):273-275. doi: 10.31662/jmaj.2024-0268. Epub 2024 Dec 6.