Collins Annie, Alexander Rohan
University of Toronto, Toronto, Canada.
Scientometrics. 2022;127(8):4655-4673. doi: 10.1007/s11192-022-04418-2. Epub 2022 Jul 4.
To examine the reproducibility of COVID-19 research, we create a dataset of COVID-19-related pre-prints posted to arXiv, bioRxiv, and medRxiv between 28 January 2020 and 30 June 2021. We extract the text of these pre-prints and parse it for keyword markers signaling the availability of the data and code underpinning each pre-print. For the pre-prints in our sample, we are unable to find markers of either open data or open code for 75% of those on arXiv, 67% of those on bioRxiv, and 79% of those on medRxiv.
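The keyword-marker step described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the marker lists and the function names `find_markers` and `share_without_markers` are hypothetical, chosen only to show how extracted text might be scanned for open-data and open-code signals and how the share of pre-prints without either marker could be computed.

```python
import re

# Hypothetical marker lists: the abstract does not enumerate the actual
# keywords, so these patterns are illustrative assumptions only.
DATA_MARKERS = [
    r"data (?:is|are) available",
    r"datasets? (?:is|are) available",
    r"open data",
    r"zenodo\.org",
    r"osf\.io",
]
CODE_MARKERS = [
    r"code (?:is|are) available",
    r"open code",
    r"github\.com",
    r"gitlab\.com",
]


def _compile(patterns):
    """Join the patterns into one case-insensitive alternation."""
    return re.compile("|".join(f"(?:{p})" for p in patterns), re.IGNORECASE)


DATA_RE = _compile(DATA_MARKERS)
CODE_RE = _compile(CODE_MARKERS)


def find_markers(text):
    """Report whether the text contains any open-data or open-code marker."""
    # Collapse whitespace so a marker split across a line break still matches.
    flat = re.sub(r"\s+", " ", text)
    return {
        "open_data": bool(DATA_RE.search(flat)),
        "open_code": bool(CODE_RE.search(flat)),
    }


def share_without_markers(texts):
    """Fraction of texts with neither an open-data nor an open-code marker."""
    texts = list(texts)
    if not texts:
        return 0.0
    missing = sum(1 for t in texts if not any(find_markers(t).values()))
    return missing / len(texts)
```

Calling `share_without_markers` on the extracted texts from one server would give a figure comparable in kind to the per-server percentages reported above, although the result depends entirely on the marker lists chosen. Keyword matching of this sort can miss availability statements phrased in unexpected ways, which is why the abstract speaks of being unable to find markers rather than of data or code being absent.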