Interdepartmental Neuroscience Program, Yale School of Medicine, New Haven, CT, USA.
MD/PhD Program, Yale School of Medicine, New Haven, CT, USA.
Nat Hum Behav. 2021 Feb;5(2):185-193. doi: 10.1038/s41562-020-01005-4. Epub 2020 Dec 7.
Large datasets that enable researchers to perform investigations with unprecedented rigor are becoming increasingly common in neuroimaging. With the concurrent rise of open science, these state-of-the-art datasets are more accessible than ever to researchers around the world. While analysis of these samples has pushed the field forward, the datasets pose a new set of challenges that can be difficult for novice users. Here we offer practical tips for working with large datasets from the end-user's perspective. We cover all aspects of the data lifecycle: what to consider when downloading and storing the data, how to become acquainted with a dataset one did not collect, and what to share when communicating results. This manuscript serves as a practical guide for working with large neuroimaging datasets, with the aim of lowering barriers to scientific discovery.