SciSciNet: A large-scale open data lake for the science of science research.

Affiliations

Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA.

Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA.

Publication information

Sci Data. 2023 Jun 1;10(1):315. doi: 10.1038/s41597-023-02198-9.

Abstract

The science of science has attracted growing research interest, partly due to the increasing availability of large-scale datasets capturing the inner workings of science. These datasets, and the numerous linkages among them, enable researchers to ask a range of fascinating questions about how science works and where innovation occurs. Yet as datasets grow, it becomes increasingly difficult to track available sources and linkages across datasets. Here we present SciSciNet, a large-scale open data lake for science of science research, covering over 134M scientific publications and millions of external linkages to funding and public uses. We offer detailed documentation of the pre-processing steps and analytical choices made in constructing the data lake. We further supplement the data lake by computing frequently used measures in the literature, illustrating how researchers may contribute collectively to enriching the data lake. Overall, this data lake serves as an initial but useful resource for the field by lowering the barrier to entry, reducing duplication of effort in data processing and measurement, improving the robustness and replicability of empirical claims, and broadening the diversity and representation of ideas in the field.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a7d/10235093/54b9672e71e2/41597_2023_2198_Fig1_HTML.jpg
