Detection of ChatGPT fake science with the xFakeSci learning algorithm.

Affiliations

Complex Adaptive Systems and Computational Intelligence Laboratory, State University of New York at Binghamton, Binghamton, NY, 13902, USA.

Hefei University of Technology, Key Laboratory of Knowledge Engineering with Big Data (the Ministry of Education of China), Hefei, 230009, China.

Publication information

Sci Rep. 2024 Jul 14;14(1):16231. doi: 10.1038/s41598-024-66784-6.

Abstract

Generative AI tools exemplified by ChatGPT are becoming a new reality. This study is motivated by the premise that "AI generated content may exhibit a distinctive behavior that can be separated from scientific articles". In this study, we show how articles can be generated by means of prompt engineering for various diseases and conditions. We then show how we tested this premise in two phases and prove its validity. Subsequently, we introduce xFakeSci, a novel learning algorithm that is capable of distinguishing ChatGPT-generated articles from publications produced by scientists. The algorithm is trained using network models derived from both sources. To mitigate overfitting issues, we incorporated a calibration step built upon data-driven heuristics, including proximity and ratios. Specifically, from a total of 3952 fake articles covering three different medical conditions, the algorithm was trained using only 100 articles, but calibrated using folds of 100 articles. The classification step was performed using 300 articles per condition. The actual labeling step took place against an equal mix of 50 generated articles and 50 authentic PubMed abstracts. The testing also spanned publication periods from 2010 to 2024 and encompassed research on three distinct diseases: cancer, depression, and Alzheimer's. Further, we evaluated the accuracy of the xFakeSci algorithm against several classical data mining algorithms (e.g., Support Vector Machines, Regression, and Naive Bayes). The xFakeSci algorithm achieved F1 scores ranging from 80% to 94%, outperforming the common data mining algorithms, which scored F1 values between 38% and 52%. We attribute this noticeable difference to the introduction of calibration and a proximity-distance heuristic, which underpins the promising performance. Indeed, the prediction of fake science generated by ChatGPT presents a considerable challenge. Nonetheless, the introduction of the xFakeSci algorithm is a significant step toward combating fake science.
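
The abstract summarizes, but does not publish, the xFakeSci implementation. As a minimal sketch of the evaluation protocol it describes, the following Python snippet shows how the classical baselines mentioned above (Support Vector Machine, logistic regression, and naive Bayes) could be trained on labeled abstracts and scored with F1 against a balanced 50/50 mix of generated and authentic texts; the corpora, labels, and placeholder strings below are illustrative assumptions, not the study's data or code.

```python
# Illustrative sketch of the baseline comparison described in the abstract:
# classical text classifiers scored with F1 on a balanced mix of
# ChatGPT-generated vs. authentic PubMed abstracts.
# The tiny placeholder corpora below stand in for the real data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder corpora: label 1 = generated article, label 0 = authentic abstract.
train_texts = ["generated abstract about cancer ...",
               "authentic pubmed abstract on cancer ..."] * 50
train_labels = [1, 0] * 50
test_texts = ["generated abstract about depression ...",
              "authentic pubmed abstract on depression ..."] * 50
test_labels = [1, 0] * 50  # equal 50/50 mix, as in the study's labeling step

baselines = {
    "SVM": LinearSVC(),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": MultinomialNB(),
}

for name, clf in baselines.items():
    # Bag-of-words (TF-IDF) features feeding each classical classifier.
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(train_texts, train_labels)
    preds = model.predict(test_texts)
    print(f"{name}: F1 = {f1_score(test_labels, preds):.2f}")
```

The xFakeSci algorithm itself additionally relies on network models built from both sources and a proximity/ratio-based calibration step, details of which are given in the full paper rather than reproduced here.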


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/58d5/11247077/7bae0342c2e1/41598_2024_66784_Fig1_HTML.jpg
