Kobak Dmitry, González-Márquez Rita, Horvát Emőke-Ágnes, Lause Jan
Hertie Institute for AI in Brain Health, University of Tübingen, 72076 Tübingen, Germany.
Northwestern University, Evanston, IL 60208, USA.
Sci Adv. 2025 Jul 4;11(27):eadt3813. doi: 10.1126/sciadv.adt3813. Epub 2025 Jul 2.
Large language models (LLMs) like ChatGPT can generate and revise text with human-level performance. These models come with clear limitations: they can produce inaccurate information and reinforce existing biases. Yet many scientists use them in their scholarly writing. How widespread is such LLM usage in the academic literature? To answer this question for the field of biomedical research, we present an unbiased, large-scale approach: we study vocabulary changes in more than 15 million biomedical abstracts from 2010 to 2024 indexed by PubMed and show how the appearance of LLMs led to an abrupt increase in the frequency of certain style words. This excess word analysis suggests that at least 13.5% of 2024 abstracts were processed with LLMs. This lower bound differed across disciplines, countries, and journals, reaching 40% for some subcorpora. We show that LLMs have had an unprecedented impact on scientific writing in biomedical research, surpassing the effect of major world events such as the COVID pandemic.
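To illustrate the excess word idea described in the abstract, below is a minimal sketch of how an excess frequency could be computed for a single style word. The yearly frequencies, the choice of two pre-LLM baseline years, the linear extrapolation, and the example word are illustrative assumptions for this sketch, not the authors' actual data or exact procedure.

```python
# Hypothetical sketch: compare a word's observed 2024 frequency with a
# counterfactual frequency extrapolated from pre-LLM years. All numbers
# and the extrapolation rule are illustrative assumptions.

def excess_frequency(freq_by_year: dict[int, float], target_year: int = 2024) -> tuple[float, float]:
    """Return (excess gap, excess ratio) for one word.

    freq_by_year maps a year to the fraction of abstracts containing the word.
    The counterfactual target-year frequency is a simple linear extrapolation
    from two earlier baseline years (here assumed to be target_year-3 and -2).
    """
    y1, y2 = target_year - 3, target_year - 2          # e.g. 2021 and 2022
    counterfactual = freq_by_year[y2] + (freq_by_year[y2] - freq_by_year[y1])
    observed = freq_by_year[target_year]
    gap = observed - counterfactual                     # absolute excess frequency
    ratio = observed / counterfactual if counterfactual > 0 else float("inf")
    return gap, ratio

# Illustrative numbers only (not from the paper): a style word that jumps
# from ~0.1% of abstracts in pre-LLM years to 1% in 2024.
freqs = {2021: 0.001, 2022: 0.001, 2024: 0.010}
gap, ratio = excess_frequency(freqs)
print(f"excess gap = {gap:.3f}, excess ratio = {ratio:.1f}")
```

Under these assumptions, a word appearing in 1% of 2024 abstracts while projected at 0.1% yields an excess gap of about 0.9 percentage points, meaning at least that share of abstracts shows LLM-style wording; aggregating such gaps across many marker words is what produces a lower bound of the kind reported in the abstract.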