Department of Neurosurgery and Neurooncology, First Faculty of Medicine, Charles University, Prague, Czech Republic.
Department of Psychiatry, Faculty of Medicine in Pilsen, Charles University, Pilsen, Czech Republic.
J Med Internet Res. 2023 May 31;25:e46924. doi: 10.2196/46924.
Artificial intelligence (AI) has advanced substantially in recent years, transforming many industries and improving the way people live and work. In scientific research, AI can enhance the quality and efficiency of data analysis and publication. However, AI has also opened up the possibility of generating high-quality fraudulent papers that are difficult to detect, raising important questions about the integrity of scientific research and the trustworthiness of published papers.
The aim of this study was to investigate the capabilities of current AI language models in generating high-quality fraudulent medical articles. We hypothesized that modern AI models can create highly convincing fraudulent papers that can easily deceive readers and even experienced researchers.
This proof-of-concept study used ChatGPT (Chat Generative Pre-trained Transformer) powered by the GPT-3 (Generative Pre-trained Transformer 3) language model to generate a fraudulent scientific article related to neurosurgery. GPT-3 is a large language model developed by OpenAI that uses deep learning algorithms to generate human-like text in response to prompts given by users. The model was trained on a massive corpus of text from the internet and is capable of generating high-quality text in a variety of languages and on various topics. The authors posed questions and prompts to the model and refined them iteratively as the model generated responses. The goal was to create a completely fabricated article including the abstract, introduction, materials and methods, discussion, references, charts, etc. Once the article was generated, it was reviewed for accuracy and coherence by experts in the fields of neurosurgery, psychiatry, and statistics and compared with similar existing articles.
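For readers who want to reproduce the general workflow, the following minimal Python sketch shows what iterative prompting of a large language model could look like programmatically. It is an assumption-laden illustration only: the study used the ChatGPT web interface rather than code, and the model name, prompts, and number of refinement rounds below are hypothetical, not those used by the authors.

# Illustrative sketch only; the authors worked in the ChatGPT web interface.
# Assumes the OpenAI Python client (>=1.0) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# A running conversation history lets each follow-up prompt refine the prior output.
messages = [
    {"role": "system", "content": "You are drafting a scientific manuscript."},
    {"role": "user", "content": "Draft an abstract for a clinical study in neurosurgery."},
]

text = ""
for _ in range(3):  # iterative refinement loop (number of rounds is arbitrary here)
    reply = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
    text = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": text})
    # In the study, a human reviewed each response and posed the next prompt;
    # this fixed follow-up stands in for that manual step.
    messages.append({"role": "user", "content": "Expand the methods and add a results table."})

print(text)  # latest model output after refinement

In the study itself, the refinement prompts were chosen interactively by the human user based on each response, which is the step this loop cannot capture.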
The study found that the AI language model can create a highly convincing fraudulent article that resembles a genuine scientific paper in terms of word usage, sentence structure, and overall composition. The AI-generated article included standard sections such as introduction, materials and methods, results, and discussion, as well as a data sheet. It consisted of 1992 words and 17 citations, and the whole process of article creation took approximately 1 hour without any special training of the human user. However, closer review identified some concerns and specific errors in the generated article, particularly in the references.
The study demonstrates the potential of current AI language models to generate completely fabricated scientific articles. Although such papers can appear sophisticated and seemingly flawless, expert readers may identify semantic inaccuracies and errors upon closer inspection. We highlight the need for increased vigilance and better detection methods to combat the potential misuse of AI in scientific research. At the same time, it is important to recognize the potential benefits of using AI language models in genuine scientific writing and research, such as manuscript preparation and language editing.
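Because the clearest errors in the generated article were in its references, one simple detection aid would be to check whether cited works resolve to real bibliographic records. The sketch below queries the public Crossref REST API for a free-text citation; the study does not describe or endorse this tool, and the matching heuristic and threshold are illustrative assumptions.

# Hedged sketch: flag references that do not resolve to a plausible Crossref record.
# Requires the third-party 'requests' package; matching logic here is deliberately crude.
import requests

def reference_exists(citation: str) -> bool:
    """Return True if Crossref returns a plausible match for a free-text citation."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": citation, "rows": 1},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items:
        return False
    # Crude check: require some of the queried words to appear in the top hit's title.
    title = " ".join(items[0].get("title", [])).lower()
    return any(word in title for word in citation.lower().split()[:5])

# Hypothetical citation string, not taken from the generated article.
suspect = "Smith J, et al. Deep brain stimulation outcomes in glioblastoma. J Neurosurg. 2021."
print(reference_exists(suspect))

Automated checks of this kind can only flag candidates for human review; fabricated references that happen to resemble real records would still require expert scrutiny.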