

Quantifying the Scope of Artificial Intelligence-Assisted Writing in Orthopaedic Medical Literature: An Analysis of Prevalence and Validation of AI-Detection Software.

Author information

Porto Joshua R, Morgan Kerry A, Hecht Christian J, Burkhart Robert J, Liu Raymond W

Affiliations

From the Department of Orthopaedic Surgery, University Hospitals of Cleveland, Case Western Reserve University, Cleveland, OH (Porto, Morgan, Hecht, Burkhart, and Liu), and the Case Western Reserve University School of Medicine, Cleveland, OH (Porto, Morgan, and Hecht).

Publication information

J Am Acad Orthop Surg. 2025 Jan 1;33(1):42-50. doi: 10.5435/JAAOS-D-24-00084. Epub 2024 Nov 19.

Abstract

INTRODUCTION

The popularization of generative artificial intelligence (AI), including Chat Generative Pre-trained Transformer (ChatGPT), has raised concerns for the integrity of academic literature. This study asked the following questions: (1) Has the popularization of publicly available generative AI, such as ChatGPT, increased the prevalence of AI-generated orthopaedic literature? (2) Can AI detectors accurately identify ChatGPT-generated text? (3) Are there associations between article characteristics and the likelihood that it was AI generated?

METHODS

PubMed was searched across six major orthopaedic journals to identify articles received for publication after January 1, 2023. Two hundred and forty articles were randomly selected and entered into three popular AI detectors. Twenty articles published by each journal before the release of ChatGPT were randomly selected as negative control articles. Thirty-six positive control articles (6 per journal) were created by altering 25%, 50%, and 100% of the text from negative control articles using ChatGPT and were then used to validate each detector. For each detector, the mean percentage of text detected as written by AI was compared between pre-ChatGPT and post-ChatGPT release articles using an independent t-test. Multivariate regression analysis was conducted using percentage AI-generated text per journal, article type (ie, cohort, clinical trial, review), and month of submission.
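The pre-release versus post-release comparison can be illustrated with a minimal sketch of an independent two-sample t statistic. The abstract does not state whether equal variances were assumed, so the pooled-variance (Student) form used here is an assumption, and the function name `independent_t` and all scores are invented purely for illustration; they are not study data. The P value would come from a t distribution with the returned degrees of freedom and is omitted to keep the sketch within the standard library.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Student's two-sample t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Equal variances are assumed.
    """
    na, nb = len(a), len(b)
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical "% of text flagged as AI" per article (made-up values).
pre_release = [1.0, 2.0, 3.0, 4.0]
post_release = [3.0, 4.0, 5.0, 6.0]

t_stat, df = independent_t(post_release, pre_release)
print(f"t = {t_stat:.3f}, df = {df}")
```

A positive t here means the post-release group has the higher mean percentage flagged.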

RESULTS

One AI detector consistently and accurately identified AI-generated text in positive control articles, whereas two others showed poor sensitivity and specificity. The most accurate detector showed a modest increase in the percentage AI detected for the articles received post release of ChatGPT (+1.8%, P = 0.01). Regression analysis showed no consistent associations between likelihood of AI-generated text per journal, article type, or month of submission.
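Sensitivity and specificity against control articles can be computed as in the following sketch. The abstract does not report how a detector's percentage output was turned into a positive or negative call, so the 50% cutoff, the function name, and every score below are hypothetical and chosen only to show the arithmetic.

```python
def sensitivity_specificity(is_ai, flagged):
    """Return (sensitivity, specificity) from paired boolean lists.

    is_ai:   True for positive controls (ChatGPT-altered text).
    flagged: True where the detector called the article AI-generated.
    """
    pairs = list(zip(is_ai, flagged))
    tp = sum(1 for truth, call in pairs if truth and call)
    fn = sum(1 for truth, call in pairs if truth and not call)
    tn = sum(1 for truth, call in pairs if not truth and not call)
    fp = sum(1 for truth, call in pairs if not truth and call)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical detector scores (% AI) for 4 positive and 4 negative controls.
scores = [90, 80, 60, 20, 5, 10, 0, 70]
is_ai = [True, True, True, True, False, False, False, False]

CUTOFF = 50  # assumed threshold, not taken from the study
flagged = [score >= CUTOFF for score in scores]

sens, spec = sensitivity_specificity(is_ai, flagged)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Moving the cutoff trades one measure against the other, which is one reason the interpretation of detector output needs to be standardized before results can be compared across tools.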

CONCLUSIONS

Because this study found an early, albeit modest, effect of generative AI on the orthopaedic literature, proper oversight will play a critical role in maintaining research integrity and accuracy. AI detectors may play a critical role in regulatory efforts, although they will require further development and standardization of how their results are interpreted.

