Quantifying large language model usage in scientific papers.

Author information

Liang Weixin, Zhang Yaohui, Wu Zhengxuan, Lepp Haley, Ji Wenlong, Zhao Xuandong, Cao Hancheng, Liu Sheng, He Siyu, Huang Zhi, Yang Diyi, Potts Christopher, Manning Christopher D, Zou James

Affiliations

Department of Computer Science, Stanford University, Stanford, CA, USA.

Department of Electrical Engineering, Stanford University, Stanford, CA, USA.

Publication information

Nat Hum Behav. 2025 Aug 4. doi: 10.1038/s41562-025-02273-8.

Abstract

Scientific publishing is the primary means of disseminating research findings. There has been speculation about how extensively large language models (LLMs) are being used in academic writing. Here we conduct a systematic analysis across 1,121,912 preprints and published papers from January 2020 to September 2024 on arXiv, bioRxiv and Nature portfolio journals, using a population-level framework based on word frequency shifts to estimate the prevalence of LLM-modified content over time. Our findings suggest a steady increase in LLM usage, with the largest and fastest growth estimated for computer science papers (up to 22%). By comparison, mathematics papers and the Nature portfolio showed lower evidence of LLM modification (up to 9%). LLM modification estimates were higher among papers from first authors who post preprints more frequently, papers in more crowded research areas and papers of shorter lengths. Our findings suggest that LLMs are being broadly used in scientific writing.
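The "population-level framework based on word frequency shifts" described in the abstract can be read as a mixture-model estimation problem: the corpus-level occurrence rates of certain marker words are modeled as a blend of their rates in human-written and in LLM-modified reference text, and the mixing weight is the estimated prevalence of LLM-modified content. The sketch below is a minimal toy illustration of that idea; the marker-word probabilities, the independence assumptions, the simulated data and the function names are all hypothetical and do not reproduce the authors' published method or results.

```python
# Illustrative sketch only: a toy maximum-likelihood estimate of the fraction
# alpha of LLM-modified documents in a corpus, based on per-document occurrence
# of "marker" words whose frequency shifts between human-written and
# LLM-modified text. All numbers below are made up for illustration; this is
# NOT the authors' published implementation or data.
import numpy as np
from scipy.optimize import minimize_scalar

# Assumed per-document probability that each marker word appears at least once,
# estimated separately from a pre-LLM human-written reference corpus and from a
# corpus of known LLM-modified text (hypothetical values).
p_human = np.array([0.02, 0.05, 0.01, 0.03])
p_llm   = np.array([0.12, 0.20, 0.08, 0.15])

def neg_log_likelihood(alpha, occurrences):
    """Negative log-likelihood of 0/1 occurrence data under the mixture
    P(word occurs) = (1 - alpha) * p_human + alpha * p_llm."""
    p_mix = (1.0 - alpha) * p_human + alpha * p_llm
    ll = occurrences * np.log(p_mix) + (1 - occurrences) * np.log(1.0 - p_mix)
    return -ll.sum()

def estimate_alpha(occurrences):
    """MLE of alpha in [0, 1] for an (n_docs, n_words) binary occurrence matrix."""
    res = minimize_scalar(neg_log_likelihood, bounds=(0.0, 1.0),
                          args=(occurrences,), method="bounded")
    return res.x

# Toy corpus: simulate 10,000 documents of which 15% are "LLM-modified".
rng = np.random.default_rng(0)
true_alpha = 0.15
modified = rng.random(10_000) < true_alpha
probs = np.where(modified[:, None], p_llm, p_human)
occurrences = (rng.random(probs.shape) < probs).astype(int)

print(f"estimated alpha = {estimate_alpha(occurrences):.3f}")  # close to 0.15
```

The key design point, as the abstract suggests, is that estimation happens at the population level: no individual paper is classified as LLM-written or not; only the aggregate mixing proportion is inferred from shifts in word frequencies across the whole corpus.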

