European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom.
PLoS One. 2024 Sep 26;19(9):e0303005. doi: 10.1371/journal.pone.0303005. eCollection 2024.
Preprints provide an indispensable tool for rapid and open communication of early research findings. Preprints can also be revised and improved based on scientific commentary uncoupled from journal-organised peer review. The uptake of preprints in the life sciences has increased significantly in recent years, especially during the COVID-19 pandemic, when immediate access to research findings became crucial to address the global health emergency. With ongoing expansion of new preprint servers, improving discoverability of preprints is a necessary step to facilitate wider sharing of the science reported in preprints. To address the challenges of preprint visibility and reuse, Europe PMC, an open database of life science literature, began indexing preprint abstracts and metadata from several platforms in July 2018. Since then, Europe PMC has continued to increase coverage through addition of new servers, and expanded its preprint initiative to include the full text of preprints related to COVID-19 in July 2020 and then the full text of preprints supported by the Europe PMC funder consortium in April 2022. The preprint collection can be searched via the website and programmatically, with abstracts and the open access full text of COVID-19 and Europe PMC funder preprint subsets available for bulk download in a standard machine-readable JATS XML format. This enables automated information extraction for large-scale analyses of the preprint corpus, accelerating scientific research of the preprint literature itself. This publication describes steps taken to build trust, improve discoverability, and support reuse of life science preprints in Europe PMC. Here we discuss the benefits of indexing preprints alongside peer-reviewed publications, and challenges associated with this process.
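As a rough illustration of the programmatic search mentioned above, the following Python sketch builds a query against the Europe PMC RESTful search service and restricts it to preprints. It is a minimal sketch rather than the authors' own tooling. It assumes the public search endpoint, the `SRC:PPR` source filter for preprints, and a JSON response that nests records under `resultList.result`; the helper names `build_preprint_url`, `extract_titles`, and `search_preprints` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Europe PMC RESTful search service.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_preprint_url(terms: str, page_size: int = 25) -> str:
    """Build a search URL limited to preprint records (assumed filter: SRC:PPR)."""
    params = {
        "query": f"({terms}) AND SRC:PPR",
        "format": "json",
        "pageSize": page_size,
        "resultType": "lite",
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def extract_titles(payload: dict) -> list:
    """Collect titles from a response, assuming the resultList.result layout."""
    results = payload.get("resultList", {}).get("result", [])
    return [record.get("title", "") for record in results]


def search_preprints(terms: str, page_size: int = 25) -> list:
    """Run the search over the network and return the titles of matching preprints."""
    with urlopen(build_preprint_url(terms, page_size), timeout=30) as response:
        return extract_titles(json.load(response))
```

Calling `search_preprints("covid-19", page_size=5)` would request the first five matching preprint records. Bulk access to the full-text JATS XML subsets is a separate download route and is not covered by this sketch.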