Residency Application Selection Committee Discriminatory Ability in Identifying Artificial Intelligence-Generated Personal Statements.

Author Affiliations

Department of Surgery, Community Medical Center, RWJ/Barnabas Health, Tom's River, New Jersey.

Department of Surgery, Robert Wood Johnson Medical School, New Brunswick, New Jersey.

Publication Information

J Surg Educ. 2024 Jun;81(6):780-785. doi: 10.1016/j.jsurg.2024.02.009. Epub 2024 Apr 27.

DOI:10.1016/j.jsurg.2024.02.009
PMID:38679494
Abstract

OBJECTIVE

Advances in artificial intelligence (AI) have given rise to sophisticated algorithms capable of generating human-like text. The goal of this study was to evaluate the ability of human reviewers to reliably differentiate personal statements (PS) written by human authors from those generated by AI software.

SETTING

Four personal statements from the archives of two surgical program directors were de-identified and used as the human samples. Two AI platforms were used to generate nine additional PS.

PARTICIPANTS

Four surgeons from the residency selection committees of two surgical residency programs of a large multihospital system served as blinded reviewers. AI was also asked to evaluate each PS sample for authorship.

DESIGN

Sensitivity, specificity, and accuracy of the reviewers in identifying the PS author were calculated. The kappa statistic for agreement between the hypothesized author and the true author was calculated. Inter-rater reliability was calculated using the kappa statistic with Light's modification, given more than two reviewers in a fully crossed design. Logistic regression was performed to model the impact of perceived creativity, writing quality, and authorship on the likelihood of offering an interview.
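As a rough illustration of the statistics named in this design, the following Python sketch computes sensitivity, specificity, and accuracy from paired true/guessed labels, Cohen's kappa for two label sequences, and Light's kappa as the mean of all pairwise Cohen's kappas across reviewers. The function names and the six-item label lists are hypothetical and are not data from the study; the abstract does not say which software or exact formulas the authors used.

```python
from itertools import combinations


def diagnostic_metrics(truth, guess, positive="AI"):
    """Sensitivity, specificity, and accuracy, treating `positive` as the target class."""
    pairs = list(zip(truth, guess))
    tp = sum(t == positive and g == positive for t, g in pairs)
    tn = sum(t != positive and g != positive for t, g in pairs)
    fp = sum(t != positive and g == positive for t, g in pairs)
    fn = sum(t == positive and g != positive for t, g in pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def cohen_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement between two label sequences.

    Undefined (division by zero) when chance agreement equals 1,
    e.g. when both sequences use a single identical label throughout.
    """
    n = len(a)
    labels = set(a) | set(b)
    observed = sum(x == y for x, y in zip(a, b)) / n
    expected = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    return (observed - expected) / (1 - expected)


def light_kappa(ratings):
    """Light's kappa: mean of Cohen's kappa over every pair of raters."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical labels for six statements ("H" = human-written), NOT study data.
truth = ["AI", "AI", "AI", "AI", "H", "H"]
reviewer_1 = ["AI", "AI", "AI", "H", "AI", "H"]
reviewer_2 = ["AI", "H", "AI", "AI", "AI", "H"]
reviewer_3 = ["AI", "AI", "H", "AI", "H", "H"]

print(diagnostic_metrics(truth, reviewer_1))
print(cohen_kappa(truth, reviewer_1))
print(light_kappa([reviewer_1, reviewer_2, reviewer_3]))
```

In a fully crossed design every reviewer rates every statement, which is what allows each pair of reviewers to be compared on the same items before averaging.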

RESULTS

Human reviewer sensitivity for identifying an AI-generated PS was 0.87, with a specificity of 0.37 and an overall accuracy of 0.55. The level of agreement, by kappa statistic, between the reviewer estimate of authorship and the true authorship was 0.19 (slight agreement). The reviewers themselves had an inter-rater reliability of 0.067 (poor), with complete agreement (four out of four reviewers) on only two PS, both authored by humans. The odds of offering an interview (compared to a composite of "backup" status or no interview) to a perceived human author were 7 times those for a perceived AI author (95% confidence interval 1.5276 to 32.0758, p=0.0144). AI hypothesized human authorship for twelve of the PS and was "unsure" about the remaining one.
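The reported confidence interval can be checked against the sevenfold point estimate with a short calculation. The sketch below assumes the interval is a Wald-type interval, symmetric on the log-odds scale, which the abstract does not state. Under that assumption the point estimate is the geometric mean of the two bounds, and the implied standard error of the log odds ratio follows from the interval width. The variable names are illustrative.

```python
import math

# Bounds of the 95% confidence interval as reported in the abstract.
lower, upper = 1.5276, 32.0758

# Assumption: Wald-type interval, symmetric around log(OR).
# The point estimate is then the geometric mean of the bounds.
odds_ratio = math.sqrt(lower * upper)

# Implied standard error of log(OR): the interval spans 2 * 1.96 standard errors.
z_95 = 1.959964
se_log_or = (math.log(upper) - math.log(lower)) / (2 * z_95)

print(f"odds ratio: {odds_ratio:.2f}")  # prints "odds ratio: 7.00"
print(f"SE of log(OR): {se_log_or:.3f}")
```

The recovered estimate agrees with the stated factor of 7, which is consistent with the interval having been computed on the log scale.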

CONCLUSIONS

The increasing pervasiveness of AI will have far-reaching effects, including on the resident application and recruitment process. Identifying AI-generated personal statements is exceedingly difficult. With the decreasing availability of objective data to assess applicants, a review and potential restructuring of the approach to resident recruitment may be warranted.
