Scorza Livia C T, Zieliński Tomasz, Kalita Irina, Lepore Alessia, El Karoui Meriem, Millar Andrew J
Centre for Engineering Biology and School of Biological Sciences, University of Edinburgh, Edinburgh, Scotland, EH9 3BF, UK.
Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, Scotland, EH9 3JD, UK.
Wellcome Open Res. 2024 Dec 5;9:523. doi: 10.12688/wellcomeopenres.22899.1. eCollection 2024.
Data reusability is the driving force of the research data life cycle. However, implementing strategies that generate reusable data at every stage, from data creation to sharing, remains a significant challenge. Even when the datasets supporting a study are publicly shared, the outputs are often incomplete and/or not reusable. The FAIR (Findable, Accessible, Interoperable, Reusable) principles were published as general guidance to promote data reusability in research, but their practical implementation in research groups still lags behind. In biology, the main obstacles to achieving FAIR data include the lack of standard practices for a large diversity of data types, data storage and preservation issues, and researchers' lack of familiarity with FAIR practices. Past literature describes biological curation from the perspective of data resources that aggregate data, often from publications.
Our team works alongside data-generating, experimental researchers, so our perspective aligns with that of publication authors rather than aggregators. We detail the processes for organizing datasets for publication, showcasing practical examples from data curation to data sharing. We also recommend strategies, tools and web resources to maximize data reusability while maintaining research productivity.
We propose a simple approach to the research data management challenges that experimentalists face, designed to promote FAIR data sharing. This strategy not only simplifies data management but also enhances data visibility, recognition and impact, ultimately benefiting the entire scientific community.