
Reviewer Experience Detecting and Judging Human Versus Artificial Intelligence Content: The Journal Essay Contest.

Author affiliations

Hospital Israelita Albert Einstein and Departamento de Neurologia e Neurocirurgia, Universidade Federal de São Paulo, Brazil (G.S.S.).

Section of Cardiovascular Medicine (R.K.), Yale School of Medicine, New Haven, CT.

Publication information

Stroke. 2024 Oct;55(10):2573-2578. doi: 10.1161/STROKEAHA.124.045012. Epub 2024 Sep 3.

DOI:10.1161/STROKEAHA.124.045012
PMID:39224979
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11529699/
Abstract

Artificial intelligence (AI) large language models (LLMs) now produce human-like general text and images. LLMs' ability to generate persuasive scientific essays that undergo evaluation under traditional peer review has not been systematically studied. To measure perceptions of quality and the nature of authorship, we conducted a competitive essay contest in 2024 with both human and AI participants. Human authors and 4 distinct LLMs generated essays on controversial topics in stroke care and outcomes research. A panel of Editorial Board members (mostly vascular neurologists), blinded to author identity and with varying levels of AI expertise, rated the essays for quality, persuasiveness, best in topic, and author type. Among 34 submissions (22 human and 12 LLM) scored by 38 reviewers, human and AI essays received mostly similar ratings, though AI essays were rated higher for composition quality. Author type was accurately identified only 50% of the time, with prior LLM experience associated with improved accuracy. In multivariable analyses adjusted for author attributes and essay quality, only persuasiveness was independently associated with odds of a reviewer assigning AI as author type (adjusted odds ratio, 1.53 [95% CI, 1.09-2.16]; P=0.01). In conclusion, a group of experienced editorial board members struggled to distinguish human versus AI authorship, with a bias against best in topic for essays judged to be AI generated. Scientific journals may benefit from educating reviewers on the types and uses of AI in scientific writing and developing thoughtful policies on the appropriate use of AI in authoring manuscripts.
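
As a reading aid, the reported adjusted odds ratio (1.53 [95% CI, 1.09-2.16]) can be checked for internal consistency with the stated P value. The sketch below is not the authors' analysis. It assumes the interval is a Wald interval that is symmetric on the log-odds scale, and the function name `wald_from_ci` is hypothetical.

```python
import math

def wald_from_ci(or_point, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate SE(log OR), the z statistic, and a two-sided p-value
    from an odds ratio and its 95% CI.

    Assumption (not stated in the abstract): the CI is a Wald interval,
    i.e. log(OR) +/- z_crit * SE on the log scale.
    """
    log_or = math.log(or_point)
    # The CI width on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

se, z, p = wald_from_ci(1.53, 1.09, 2.16)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, two-sided p = {p:.3f}")
```

Under that assumption the back-calculated p-value lands between 0.01 and 0.02, which is compatible with the rounded value reported in the abstract.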

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8613/11529699/7f07eaf43b7b/nihms-2017015-f0001.jpg

Similar articles

1
Reviewer Experience Detecting and Judging Human Versus Artificial Intelligence Content: The Journal Essay Contest.
Stroke. 2024 Oct;55(10):2573-2578. doi: 10.1161/STROKEAHA.124.045012. Epub 2024 Sep 3.
2
Residency Application Selection Committee Discriminatory Ability in Identifying Artificial Intelligence-Generated Personal Statements.
J Surg Educ. 2024 Jun;81(6):780-785. doi: 10.1016/j.jsurg.2024.02.009. Epub 2024 Apr 27.
3
Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19.
Cochrane Database Syst Rev. 2022 May 20;5(5):CD013665. doi: 10.1002/14651858.CD013665.pub3.
4
Artificial intelligence for diagnosing exudative age-related macular degeneration.
Cochrane Database Syst Rev. 2024 Oct 17;10(10):CD015522. doi: 10.1002/14651858.CD015522.pub2.
5
Falls prevention interventions for community-dwelling older adults: systematic review and meta-analysis of benefits, harms, and patient values and preferences.
Syst Rev. 2024 Nov 26;13(1):289. doi: 10.1186/s13643-024-02681-3.
6
Do peer reviewers comment on reporting items as instructed by the journal? A secondary analysis of two randomized trials.
J Clin Epidemiol. 2025 May 8;183:111818. doi: 10.1016/j.jclinepi.2025.111818.
7
Artificial intelligence for detecting keratoconus.
Cochrane Database Syst Rev. 2023 Nov 15;11(11):CD014911. doi: 10.1002/14651858.CD014911.pub2.
8
Defining the Boundaries of AI Use in Scientific Writing: A Comparative Review of Editorial Policies.
J Korean Med Sci. 2025 Jun 16;40(23):e187. doi: 10.3346/jkms.2025.40.e187.
9
Artificial intelligence policies in bioethics and health humanities: a comparative analysis of publishers and journals.
BMC Med Ethics. 2025 Jul 3;26(1):79. doi: 10.1186/s12910-025-01239-9.
10
Artificial Intelligence-Generated Editorials in Radiology: Can Expert Editors Detect Them?
AJNR Am J Neuroradiol. 2025 Mar 4;46(3):559-566. doi: 10.3174/ajnr.A8505.
