Kutler Rachel B, Setzen Sean A, Tsai Samantha, Rameau Anaïs
Department of Otolaryngology - Head and Neck Surgery, Sean Parker Institute for the Voice, Weill Cornell Medical College, New York, New York, USA.
Laryngoscope. 2025 Apr 25. doi: 10.1002/lary.32202.
Since the release of ChatGPT-4 in March 2023, the application of large language models (LLMs) in biomedical manuscript production has become widespread. GPT-modified text detectors, such as GPTZero, lack sensitivity and reliability and do not quantify the amount of AI-generated text. However, recent work has identified certain adjectives used disproportionately often by LLMs, which can help identify and quantify LLM-modified text. The aim of this study was to use these adjectives to identify LLM-generated text in otolaryngology publications.
Meta-research.
Twenty-five otolaryngology journals were studied between November 2022 and July 2024, encompassing 8751 published works. Articles from countries where ChatGPT-4 is not available were excluded, yielding 7702 articles for inclusion. These publications were analyzed with a Python script to determine the frequency of the top 100 adjectives disproportionately generated by ChatGPT-4.
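The script itself is not reproduced in this abstract; a minimal sketch of the frequency-counting step, assuming a hypothetical adjective list (GPT_ADJECTIVES), a helper adjective_rate, and a plain-text corpus directory named "articles" (none of which are specified by the authors), might look like the following.

import re
from collections import Counter
from pathlib import Path

# Hypothetical stand-in for the top 100 adjectives reported as disproportionately
# generated by ChatGPT-4; the authors' actual word list is not given in the abstract.
GPT_ADJECTIVES = {"commendable", "meticulous", "intricate", "notable", "versatile"}

def adjective_rate(text: str) -> float:
    """Return GPT-associated adjectives per 1,000 words of a document."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    hits = sum(counts[adj] for adj in GPT_ADJECTIVES)
    return 1000 * hits / len(words)

# Score every plain-text article in a (hypothetical) corpus directory.
corpus = Path("articles")
rates = {p.name: adjective_rate(p.read_text(encoding="utf-8")) for p in corpus.glob("*.txt")}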
A significant increase in the frequency of adjectives associated with GPT use was observed from November 2023 to July 2024 across all journals (p < 0.001), with a significant difference before and after the release of ChatGPT-4 in March 2023. Journals with higher impact factors used GPT-associated adjectives significantly less often than those with lower impact factors (p < 0.001). There was no significant difference in GPT-associated adjective use between first authors with a doctoral degree and those without. Publications by authors from English-speaking countries used LLM-associated adjectives significantly more frequently (p < 0.001).
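The abstract does not name the statistical tests behind these p-values; purely as an illustration, per-article adjective rates before and after the March 2023 release could be compared with a nonparametric test such as the Mann-Whitney U test (an assumption, not the authors' stated method), using the rates computed by the sketch above. The values below are placeholders, not study data.

from scipy.stats import mannwhitneyu

# Hypothetical per-article adjective rates (per 1,000 words), split at March 2023.
pre_release = [0.8, 1.1, 0.9, 1.0, 0.7]
post_release = [1.6, 2.1, 1.9, 1.4, 2.3]

stat, p_value = mannwhitneyu(pre_release, post_release, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.4f}")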
This study suggests that ChatGPT use in otolaryngology manuscript production has increased significantly since the release of ChatGPT-4. Future research should aim to further characterize the landscape of AI-generated text in otolaryngology and to develop tools that encourage authors' transparency regarding the use of LLMs.
NA.