Zehl Lyuba, Jaillet Florent, Stoewer Adrian, Grewe Jan, Sobolev Andrey, Wachtler Thomas, Brochier Thomas G, Riehle Alexa, Denker Michael, Grün Sonja
Institute of Neuroscience and Medicine (INM-6), Institute for Advanced Simulation (IAS-6), JARA BRAIN Institute I, Jülich Research Centre, Jülich, Germany.
Laboratoire d'informatique Fondamentale, UMR 7279, Centre National de la Recherche Scientifique, Aix-Marseille Université, Marseille, France; Institut de Neurosciences de la Timone, UMR 7289, Centre National de la Recherche Scientifique, Aix-Marseille Université, Marseille, France.
Front Neuroinform. 2016 Jul 19;10:26. doi: 10.3389/fninf.2016.00026. eCollection 2016.
The non-reproducibility of neurophysiological research is currently a matter of intense discussion in the scientific community. A crucial step toward enhancing reproducibility is the comprehensive collection and storage of metadata, that is, all information about the experiment, the data, and the preprocessing steps applied to the data, such that they can be accessed and shared in a consistent and simple manner. However, the complexity of experiments, the highly specialized analysis workflows, and a lack of knowledge about how to use supporting software tools often leave researchers too overburdened to produce such detailed documentation. As a result, the collected metadata are often incomplete, ambiguous, or incomprehensible to outsiders. Based on our research experience in dealing with diverse datasets, we provide conceptual and technical guidance for overcoming the challenges associated with the collection, organization, and storage of metadata in a neurophysiology laboratory. Using the concrete example of managing the metadata of a complex experiment that yields multi-channel recordings from monkeys performing a behavioral motor task, we demonstrate in practice how these approaches and solutions can be implemented, with the intention that they may be generalized to other projects. Moreover, we detail five use cases that demonstrate the benefits of a well-organized metadata collection when processing or analyzing the recorded data, in particular when the data are shared between laboratories in a modern scientific collaboration. Finally, we suggest an adaptable workflow to accumulate, structure, and store metadata from different sources using, by way of example, the odML metadata framework.
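The workflow of accumulating, structuring, and storing metadata from different sources can be illustrated with a minimal sketch. It uses only the Python standard library and does not use the odML library or reproduce its API; the `Section` class, the `merge_source` helper, the section names, and all values are hypothetical placeholders chosen for illustration, not taken from the experiment described above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Section:
    """A named node holding key/value properties and nested subsections."""
    name: str
    kind: str = "undefined"
    properties: dict = field(default_factory=dict)
    sections: dict = field(default_factory=dict)

    def section(self, name, kind="undefined"):
        """Return the named subsection, creating it if it does not exist."""
        if name not in self.sections:
            self.sections[name] = Section(name, kind)
        return self.sections[name]

    def put(self, key, value, unit=None):
        """Store one property together with its (optional) unit."""
        self.properties[key] = {"value": value, "unit": unit}

    def to_dict(self):
        """Convert the tree into plain dicts so it can be serialized."""
        return {
            "kind": self.kind,
            "properties": self.properties,
            "sections": {k: s.to_dict() for k, s in self.sections.items()},
        }


def merge_source(root, path, values):
    """Accumulate one source's values under a slash-separated section path."""
    node = root
    for part in path.split("/"):
        node = node.section(part)
    for key, (value, unit) in values.items():
        node.put(key, value, unit)


# Hypothetical metadata sources: a lab notebook entry and a recording
# system configuration. All values below are placeholders.
root = Section("session", kind="recording")
merge_source(root, "Subject", {"Species": ("monkey", None)})
merge_source(root, "Recording/Array", {"ChannelCount": (96, None)})
merge_source(root, "Recording/Array", {"SamplingRate": (30000, "Hz")})

# Store the structured collection in one machine-readable document.
serialized = json.dumps(root.to_dict(), indent=2, ensure_ascii=False)
print(serialized)
```

Keeping every source behind the same `merge_source` call means a new source (for example, a spreadsheet export) only needs a small adapter that produces a path and a dictionary of values; the hierarchy and the stored file format stay unchanged.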