Schmidt Marion, Kircheis Wolfgang, Simons Arno, Potthast Martin, Stein Benno
German Center for Higher Education Research and Science Studies (DZHW), Berlin, Germany.
Leipzig University and Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Leipzig, Germany.
Scientometrics. 2023;128(6):3649-3673. doi: 10.1007/s11192-023-04703-8. Epub 2023 May 14.
This paper analyzes Wikipedia's representation of the Nobel Prize-winning CRISPR/Cas9 technology, a method for gene editing. We propose and evaluate different heuristics to match publications from several publication corpora against Wikipedia's central article on CRISPR and against the complete Wikipedia revision history, in order to retrieve further Wikipedia articles relevant to the topic and to analyze Wikipedia's referencing patterns. We explore to what extent the literature referenced in Wikipedia's central article on CRISPR adheres to scientific standards and inner-scientific perspectives by assessing its overlap with (1) the Web of Science (WoS) database, (2) a WoS-based field-delineated corpus, (3) highly cited publications within this corpus, and (4) publications referenced by field-specific reviews. We develop a diachronic perspective on citation latency and compare the delays with which publications are cited in relevant Wikipedia articles to the citation dynamics of these publications over time. Our results confirm that a combination of verbatim searches by title, DOI, and PMID is sufficient and cannot be improved significantly by more elaborate search heuristics. We show that Wikipedia references a substantial number of publications that are recognized by experts and highly cited, but that it also cites less visible literature and, to a certain degree, even literature that is not strictly scientific. The delay between a publication's year and its first appearance on Wikipedia depends, most visibly for the central CRISPR article, on the dynamics of the field and on how actively editors respond to them.
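The verbatim matching and the latency measure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Publication` record, the function names, the normalization step, and the representation of the revision history as `(year, wikitext)` pairs are all assumptions made for illustration, and the example identifiers are made up.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Publication:
    # Hypothetical record type; the fields mirror the three search keys
    # named in the abstract (title, DOI, PMID) plus the publication year.
    title: str
    year: int
    doi: Optional[str] = None
    pmid: Optional[str] = None


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks do not block a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def is_cited(pub: Publication, wikitext: str) -> bool:
    """Return True if the DOI, PMID, or title occurs verbatim in the wikitext."""
    text = normalize(wikitext)
    # Plain substring test; a stricter version would also check the
    # character that follows the DOI.
    if pub.doi and pub.doi.lower() in text:
        return True
    if pub.pmid:
        # Accept "pmid=123", "pmid = 123", "pmid: 123" and "pmid 123";
        # \b prevents a match inside a longer number.
        pattern = r"pmid\s*[=:]?\s*" + re.escape(pub.pmid) + r"\b"
        if re.search(pattern, text):
            return True
    return normalize(pub.title) in text


def citation_latency(
    pub: Publication, revisions: Iterable[Tuple[int, str]]
) -> Optional[int]:
    """Years between publication and the first revision that cites it.

    `revisions` is assumed to be ordered oldest first. Returns None if no
    revision cites the publication.
    """
    for revision_year, wikitext in revisions:
        if is_cited(pub, wikitext):
            return revision_year - pub.year
    return None


if __name__ == "__main__":
    # Made-up publication and revision history, for illustration only.
    pub = Publication(
        title="An Example Study of Gene Editing",
        year=2012,
        doi="10.1234/example.5678",
        pmid="12345678",
    )
    history = [
        (2012, "No references yet."),
        (2014, "{{cite journal |title=Other |pmid = 12345678}}"),
    ]
    print(citation_latency(pub, history))
```

Matching on any one of the three keys keeps the sketch close to the "verbatim searches by title, DOI, and PMID" named in the abstract; fuzzy title matching is left out on purpose, since the abstract reports that more elaborate heuristics did not improve results significantly.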