Changing word meanings in biomedical literature reveal pandemics and new technologies.

Author information

Nicholson David N, Alquaddoomi Faisal, Rubinetti Vincent, Greene Casey S

Affiliations

Genomics and Computational Biology Program, University of Pennsylvania, Philadelphia, PA, USA.

Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, USA.

Publication information

BioData Min. 2023 May 5;16(1):16. doi: 10.1186/s13040-023-00332-2.

Abstract

While we often think of words as having a fixed meaning that we use to describe a changing world, words are also dynamic and changing. Scientific research can also be remarkably fast-moving, with new concepts or approaches rapidly gaining mind share. We examined scientific writing, both preprint and pre-publication peer-reviewed text, to identify terms that have changed and examine their use. One particular challenge that we faced was that the shift from closed to open access publishing meant that the size of available corpora changed by over an order of magnitude in the last two decades. We developed an approach to evaluate semantic shift by accounting for both intra- and inter-year variability using multiple integrated models. This analysis revealed thousands of change points in both corpora, including for terms such as 'cas9', 'pandemic', and 'sars'. We found that the consistent change points between pre-publication peer-reviewed and preprinted text are largely related to the COVID-19 pandemic. We also created a web app for exploration that allows users to investigate individual terms (https://greenelab.github.io/word-lapse/). To our knowledge, our research is the first to examine semantic shift in biomedical preprints and pre-publication peer-reviewed text, and provides a foundation for future work to understand how terms acquire new meanings and how peer review affects this process.
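
The abstract does not give the formula used to combine intra- and inter-year variability, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that each year has several independently trained embedding models whose vector spaces have already been aligned, and that a term's shift between two years can be scored as the mean cosine distance across years divided by the mean cosine distance among replicate models within each year. The function names and the ratio itself are hypothetical choices made for this example.

```python
import itertools
import numpy as np


def cosine_distance(u, v):
    """Cosine distance between two 1-D vectors."""
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def mean_pairwise_distance(vectors_a, vectors_b=None):
    """Mean cosine distance over pairs of vectors.

    With one argument: all unordered pairs within vectors_a (intra-year).
    With two arguments: all pairs drawn across the two lists (inter-year).
    """
    if vectors_b is None:
        pairs = itertools.combinations(vectors_a, 2)
    else:
        pairs = itertools.product(vectors_a, vectors_b)
    return float(np.mean([cosine_distance(u, v) for u, v in pairs]))


def shift_score(year_a, year_b):
    """Hypothetical shift score for one term between two years.

    year_a and year_b are lists of that term's vectors, one per replicate
    model (at least two per year), assumed to share an aligned space.
    Values near 1 mean the change across years is no larger than the
    disagreement among models trained on the same year.
    """
    inter = mean_pairwise_distance(year_a, year_b)
    intra = 0.5 * (mean_pairwise_distance(year_a) + mean_pairwise_distance(year_b))
    return inter / intra
```

Normalising by the intra-year distance is one way to keep noise between replicate models, which tends to be larger when a year's corpus is small, from being mistaken for a change in meaning. A separate change point detection step would still be needed to turn a series of such scores into change points.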

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07f3/10161671/df52622babd5/13040_2023_332_Fig1_HTML.jpg
