

High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content.

Author Information

Bhattacharyya Mehul, Miller Valerie M, Bhattacharyya Debjani, Miller Larry E

Affiliations

Clinical Research, Miller Scientific, Johnson City, USA.

Leadership, University of the Cumberlands, Williamsburg, USA.

Publication Information

Cureus. 2023 May 19;15(5):e39238. doi: 10.7759/cureus.39238. eCollection 2023 May.

Abstract

Background: The availability of large language models such as Chat Generative Pre-trained Transformer (ChatGPT, OpenAI) has enabled individuals from diverse backgrounds to access medical information. However, concerns exist about the accuracy of ChatGPT responses and the references used to generate medical content.

Methods: This observational study investigated the authenticity and accuracy of references in medical articles generated by ChatGPT. ChatGPT-3.5 generated 30 short medical papers, each with at least three references, based on standardized prompts encompassing various topics and therapeutic areas. Reference authenticity and accuracy were verified by searching Medline, Google Scholar, and the Directory of Open Access Journals. The authenticity and accuracy of individual ChatGPT-generated reference elements were also determined.

Results: Overall, 115 references were generated by ChatGPT, with a mean of 3.8±1.1 per paper. Among these references, 47% were fabricated, 46% were authentic but inaccurate, and only 7% were authentic and accurate. The likelihood of fabricated references significantly differed based on prompt variations, yet the frequency of authentic and accurate references remained low in all cases. Among the seven components evaluated for each reference, an incorrect PMID number was most common, listed in 93% of papers. Incorrect volume (64%), page numbers (64%), and year of publication (60%) were the next most frequent errors. The mean number of inaccurate components was 4.3±2.8 out of seven per reference.

Conclusions: The findings of this study emphasize the need for caution when seeking medical information on ChatGPT since most of the references provided were found to be fabricated or inaccurate. Individuals are advised to verify medical information from reliable sources and avoid relying solely on artificial intelligence-generated content.
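The verification logic described in the Methods and Results sections can be sketched as a simple three-way classification: a reference with no matching database record is fabricated; one with a matching record but at least one wrong component is authentic but inaccurate; otherwise it is authentic and accurate. The sketch below is illustrative only, and the exact list of seven components is an assumption based on typical citation elements plus the ones the abstract names (PMID, volume, page numbers, year of publication); the study itself did the matching by manual search, not by code.

```python
from typing import Optional, Tuple

# Assumed set of the seven reference components checked per reference.
# Four of them (pmid, volume, pages, year) are named in the abstract;
# the remaining three are a guess at standard citation fields.
COMPONENTS = ["authors", "title", "journal", "year", "volume", "pages", "pmid"]


def classify_reference(
    generated: dict, matched_record: Optional[dict]
) -> Tuple[str, int]:
    """Classify a ChatGPT-generated reference against a database lookup.

    Returns (category, number_of_inaccurate_components):
      - "fabricated": no matching record found in Medline/Google Scholar/DOAJ
      - "authentic_inaccurate": record found, but >= 1 component is wrong
      - "authentic_accurate": record found and all seven components match
    """
    if matched_record is None:
        # No real publication matches; every component counts as wrong.
        return "fabricated", len(COMPONENTS)
    errors = sum(
        1
        for field in COMPONENTS
        if generated.get(field) != matched_record.get(field)
    )
    if errors:
        return "authentic_inaccurate", errors
    return "authentic_accurate", 0
```

For example, a reference whose PMID and year disagree with the real record it points to would come back as `("authentic_inaccurate", 2)`, while one citing a paper that does not exist at all returns `("fabricated", 7)`.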


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e033/10277170/2da30fcf1f8f/cureus-0015-00000039238-i01.jpg
