Daily life in the Open Biologist's second job, as a Data Curator.

Author information

Scorza Livia C T, Zieliński Tomasz, Kalita Irina, Lepore Alessia, El Karoui Meriem, Millar Andrew J

Affiliations

Centre for Engineering Biology and School of Biological Sciences, University of Edinburgh, Edinburgh, Scotland, EH9 3BF, UK.

Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, Scotland, EH9 3JD, UK.

Publication information

Wellcome Open Res. 2024 Dec 5;9:523. doi: 10.12688/wellcomeopenres.22899.1. eCollection 2024.

DOI:10.12688/wellcomeopenres.22899.1

PMID:39360219

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11445645/

Abstract

BACKGROUND

Data reusability is the driving force of the research data life cycle. However, implementing strategies that yield reusable data at every stage, from data creation to sharing, remains a significant challenge. Even when the datasets supporting a study are publicly shared, the outputs are often incomplete and/or not reusable. The FAIR (Findable, Accessible, Interoperable, Reusable) principles were published as general guidance to promote data reusability in research, but their practical implementation in research groups still lags behind. In biology, the main impediments to achieving FAIR data include the lack of standard practices for a large diversity of data types, data storage and preservation issues, and the lack of familiarity among researchers. Past literature describes biological curation from the perspective of data resources that aggregate data, often from publications.

METHODS

Our team works alongside data-generating experimental researchers, so our perspective aligns with that of publication authors rather than aggregators. We detail the processes for organizing datasets for publication, showcasing practical examples from data curation to data sharing. We also recommend strategies, tools and web resources to maximize data reusability while maintaining research productivity.

CONCLUSION

We propose a simple approach to address research data management challenges for experimentalists, designed to promote FAIR data sharing. This strategy not only simplifies data management, but also enhances data visibility, recognition and impact, ultimately benefiting the entire scientific community.
