Alexandria Archive Institute, Open Context, San Francisco, CA 94127.
Proc Natl Acad Sci U S A. 2022 Oct 25;119(43):e2109313118. doi: 10.1073/pnas.2109313118. Epub 2022 Oct 17.
Investments in data management infrastructure often seek to catalyze new research outcomes based on the reuse of research data. To achieve the goals of these investments, we need to better understand how data creation and data quality concerns shape the potential reuse of data. The primary audience for this paper is scientific domain specialists who create and (re)use datasets documenting archaeological materials. This paper discusses practices that promote data quality in support of more open-ended reuse of data beyond the immediate needs of the creators. We argue that identifier practices play a key, but poorly recognized, role in promoting data quality and reusability. We use specific archaeological examples to demonstrate how the use of globally unique and persistent identifiers can communicate aspects of context, avoid errors and misinterpretations, and facilitate integration and reuse. We then discuss the responsibility of data creators and data reusers to employ identifiers to better maintain the contextual integrity of data, including its professional, social, and ethical dimensions.