The landscape of biomedical research.

Authors

González-Márquez Rita, Schmidt Luca, Schmidt Benjamin M, Berens Philipp, Kobak Dmitry

Affiliations

Hertie Institute for AI in Brain Health, University of Tübingen, Germany.

Tübingen AI Center, Tübingen, Germany.

Publication information

Patterns (N Y). 2024 Apr 9;5(6):100968. doi: 10.1016/j.patter.2024.100968. eCollection 2024 Jun 14.

Abstract

The number of publications in biomedicine and life sciences has grown so much that it is difficult to keep track of new scientific works and to have an overview of the evolution of the field as a whole. Here, we present a two-dimensional (2D) map of the entire corpus of biomedical literature, based on the abstract texts of 21 million English articles from the PubMed database. To embed the abstracts into 2D, we used the large language model PubMedBERT, combined with t-SNE tailored to handle samples of this size. We used our map to study the emergence of the COVID-19 literature, the evolution of the neuroscience discipline, the uptake of machine learning, the distribution of gender imbalance in academic authorship, and the distribution of retracted paper mill articles. Furthermore, we present an interactive website that allows easy exploration and will enable further insights and facilitate future research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b56/11240179/1aeadd79d2a9/fx1.jpg
