Can ChatGPT Recognize Its Own Writing in Scientific Abstracts?

Author

Sebo Paul

Affiliation

Internal Medicine, University Institute for Primary Care, Geneva University Hospital, Geneva, CHE.

Publication

Cureus. 2025 Jul 25;17(7):e88774. doi: 10.7759/cureus.88774. eCollection 2025 Jul.

DOI: 10.7759/cureus.88774
PMID: 40861680
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12375800/
Abstract

BACKGROUND

With the growing use of generative AI in scientific writing, distinguishing between AI-generated and human-authored content has become a pressing challenge. It remains unclear whether ChatGPT (OpenAI, San Francisco, CA) can accurately and consistently recognize its own output.

METHODS

We randomly selected 100 research articles published in 2000, before the advent of generative AI, from 10 high-impact internal medicine journals. For each article, a structured abstract was generated using ChatGPT-4.0 based on the full PDF. The original and AI-generated abstracts (n = 200) were then evaluated twice by ChatGPT-4.0, which was asked to rate the likelihood of authorship on a 0-10 scale (0 = definitely human, 10 = definitely ChatGPT, 5 = undetermined). Classifications of 0-4 were considered human, and 6-10 were considered AI generated.
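The 0-10 rating-to-label rule described above can be sketched in a few lines (an illustration of the thresholds stated in the methods, not code from the study):

```python
def classify(score: int) -> str:
    """Map a 0-10 authorship-likelihood score to a label,
    following the paper's stated thresholds:
    0-4 -> human, 5 -> undetermined, 6-10 -> AI-generated."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= 4:
        return "human"
    if score == 5:
        return "undetermined"
    return "ai"
```

For example, `classify(3)` yields "human" and `classify(8)` yields "ai"; a score of 5 is left undetermined rather than forced into either class.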

RESULTS

Misclassification rates were high in both rounds (49% and 47.5%). No abstract received a score of 5. Score distributions overlapped substantially between groups, with no statistically significant difference (Wilcoxon p-value = 0.93 and 0.21). Cohen's kappa for binary classification was 0.33 (95% CI: 0.19-0.46) and weighted kappa on the 0-10 scale was 0.24 (95% CI: 0.15-0.34), both reflecting poor agreement.
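Cohen's kappa, the agreement statistic reported above, compares the observed agreement between the two rating rounds with the agreement expected by chance. A minimal stdlib-only sketch of the unweighted version (illustrative, not the authors' analysis code):

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters over the same items:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e is chance agreement from marginal rates."""
    assert len(ratings_a) == len(ratings_b) and len(ratings_a) > 0
    n = len(ratings_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal frequencies,
    # summed over all categories.
    categories = set(ratings_a) | set(ratings_b)
    p_e = sum(
        (list(ratings_a).count(c) / n) * (list(ratings_b).count(c) / n)
        for c in categories
    )
    return 1.0 if p_e == 1 else (p_o - p_e) / (1 - p_e)
```

A kappa near 0 means agreement no better than chance; values in the range reported here (0.24-0.33) indicate weak round-to-round consistency.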

CONCLUSION

ChatGPT-4.0 cannot reliably identify whether a scientific abstract was written by itself or by humans. More robust external tools are needed to ensure transparency in academic authorship.

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f936/12375800/6a6c605975ab/cureus-0017-00000088774-i01.jpg

Similar Articles

1. Assessing the Reproducibility of the Structured Abstracts Generated by ChatGPT and Bard Compared to Human-Written Abstracts in the Field of Spine Surgery: Comparative Analysis. J Med Internet Res. 2024 Jun 26;26:e52001. doi: 10.2196/52001.
2. Prescription of Controlled Substances: Benefits and Risks.
3. Can we trust academic AI detective? Accuracy and limitations of AI-output detectors. Acta Neurochir (Wien). 2025 Aug 7;167(1):214. doi: 10.1007/s00701-025-06622-4.
4. Can artificial intelligence write science? A comparative analysis of human-written and artificial intelligence-generated scientific writings. J Neurosurg Spine. 2025 Aug 22:1-6. doi: 10.3171/2025.4.SPINE25519.
5. ChatGPT-4o Compared With Human Researchers in Writing Plain-Language Summaries for Cochrane Reviews: A Blinded, Randomized Non-Inferiority Controlled Trial. Cochrane Evid Synth Methods. 2025 Jul 28;3(4):e70037. doi: 10.1002/cesm.70037.
6. Artificial Intelligence in Peripheral Artery Disease Education: A Battle Between ChatGPT and Google Gemini. Cureus. 2025 Jun 1;17(6):e85174. doi: 10.7759/cureus.85174.
7. Using AI to Write a Review Article Examining the Role of the Nervous System on Skeletal Homeostasis and Fracture Healing. Curr Osteoporos Rep. 2024 Feb;22(1):217-221. doi: 10.1007/s11914-023-00854-y.
8. The Use of Artificial Intelligence in Writing Scientific Review Articles. Curr Osteoporos Rep. 2024 Feb;22(1):115-121. doi: 10.1007/s11914-023-00852-0.
9. The Utility of AI in Writing a Scientific Review Article on the Impacts of COVID-19 on Musculoskeletal Health. Curr Osteoporos Rep. 2024 Feb;22(1):146-151. doi: 10.1007/s11914-023-00855-x.
