Barba Agustin, Dominguez Santiago, Cobas Carlos, Martinsen David P, Romain Charles, Rzepa Henry S, Seoane Felipe
Mestrelab Research, S.L., Feliciano Barrera 9B - Bajo, 15706 Santiago de Compostela, Spain.
David Martinsen Consulting, Rockville, Maryland 20850, United States.
ACS Omega. 2019 Feb 14;4(2):3280-3286. doi: 10.1021/acsomega.8b03005. eCollection 2019 Feb 28.
There is an increasing focus on the part of academic institutions, funding agencies, and publishers, if not researchers themselves, on preservation and sharing of research data. Motivations for sharing include research integrity, replicability, and reuse. One of the barriers to publishing data is the extra work involved in preparing data for publication once a journal article and its supporting information have been completed. In this work, a method is described to generate both human- and machine-readable supporting information directly from the primary instrumental data files and to generate the metadata to ensure it is published in accordance with findable, accessible, interoperable, and reusable (FAIR) guidelines. Using this approach, both the human-readable supporting information and the primary (raw) data can be submitted simultaneously with little extra effort. Although traditionally the data package would be sent to a journal publisher for publication alongside the article, the data package could also be published independently in an institutional FAIR data repository. Workflows are described that store the data packages and generate metadata appropriate for such a repository. The methods both to generate and to publish the data packages have been implemented for NMR data, but the concept is extensible to other types of spectroscopic data as well.