Department of Computer Science and Engineering, UC San Diego, La Jolla, CA, United States of America.
Department of Neuroscience, UC San Diego, La Jolla, CA, United States of America.
PLoS One. 2023 Mar 8;18(3):e0281659. doi: 10.1371/journal.pone.0281659. eCollection 2023.
Preprints, versions of scientific manuscripts that precede peer review, are growing in popularity. They offer an opportunity to democratize and accelerate research, as they involve neither publication costs nor a lengthy peer review process. Preprints are often later published in peer-reviewed venues, but these publications and the original preprints are frequently not linked in any way. To address this, we developed a tool, PreprintMatch, to find matches between preprints and their corresponding published papers, if they exist. This tool outperforms existing techniques for matching preprints and papers, in both matching performance and speed. We applied PreprintMatch to search for matches between preprints from bioRxiv and medRxiv and papers indexed in PubMed. The preliminary nature of preprints offers a unique perspective into scientific projects at a relatively early stage, and with better matching between preprint and paper, we explored questions related to research inequity. We found that preprints from low-income countries are published as peer-reviewed papers at a lower rate than those from high-income countries (39.6% and 61.1%, respectively), and our data are consistent with previous work that cites a lack of resources, lack of stability, and policy choices to explain this discrepancy. Preprints from low-income countries were also found to be published more quickly (178 vs 203 days) and with lower title, abstract, and author similarity to the published version than preprints from high-income countries. Preprints from low-income countries gain more authors between the preprint and the published version than those from high-income countries (0.42 vs 0.32 authors, respectively), a practice that is significantly more frequent in China than in similar countries. Finally, we find that some publishers publish work with authors from lower-income countries more frequently than others.