Department of Neurosciences, University of California, San Diego.
Ontology Engineering Group, Polytechnic University of Madrid.
Am Psychol. 2018 Feb-Mar;73(2):111-125. doi: 10.1037/amp0000242.
Routine data sharing, defined here as the publication of the primary data and any supporting materials required to interpret the data acquired as part of a research study, is still in its infancy in psychology, as in many domains. Nevertheless, with increased scrutiny on reproducibility and more funder mandates requiring sharing of data, the issues surrounding data sharing are moving beyond whether data sharing is a benefit or a bane to science, to what data should be shared and how. Here, we present an overview of these issues, specifically focusing on the sharing of so-called "long tail" data, that is, data generated by individual laboratories as part of largely hypothesis-driven research. We draw on experiences in other domains to discuss attitudes toward data sharing, cost-benefits, best practices and infrastructure. We argue that the publishing of data sets is an integral component of 21st-century scholarship. Moreover, although not all issues around how and what to share have been resolved, a consensus on principles and best practices for effective data sharing and the infrastructure for sharing many types of data are largely in place.