Lifespan Informatics and Neuroimaging Center (PennLINC), Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Penn/CHOP Lifespan Brain Institute, Perelman School of Medicine, Children's Hospital of Philadelphia Research Institute, Philadelphia, PA 19104, USA; Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
Penn/CHOP Lifespan Brain Institute, Perelman School of Medicine, Children's Hospital of Philadelphia Research Institute, Philadelphia, PA 19104, USA; Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA 19104, United States.
Neuroimage. 2022 Nov;263:119609. doi: 10.1016/j.neuroimage.2022.119609. Epub 2022 Sep 3.
The Brain Imaging Data Structure (BIDS) is a specification accompanied by a software ecosystem that was designed to create reproducible and automated workflows for processing neuroimaging data. BIDS Apps flexibly build workflows based on the metadata detected in a dataset. However, even BIDS-valid metadata can include incorrect values or omissions that result in inconsistent processing across sessions. Additionally, in large-scale, heterogeneous neuroimaging datasets, hidden variability in metadata is difficult to detect and classify. To address these challenges, we created a Python-based software package titled "Curation of BIDS" (CuBIDS), which provides an intuitive workflow that helps users validate and manage the curation of their neuroimaging datasets. CuBIDS includes a robust implementation of BIDS validation that scales to large samples and incorporates DataLad, a version control software package for data, as an optional dependency to ensure reproducibility and provenance tracking throughout the entire curation process. CuBIDS provides tools to help users perform quality control on their images' metadata and identify unique combinations of imaging parameters. Users can then execute BIDS Apps on a subset of participants that represents the full range of acquisition parameters present in the dataset, accelerating pipeline testing on large datasets.
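The idea of identifying unique combinations of imaging parameters and then testing on a representative subset of participants can be illustrated with a minimal Python sketch. This is not the CuBIDS implementation or its API: the function names `group_by_parameters` and `pick_exemplars`, the record layout, and all metadata values are hypothetical, and the greedy selection is only one plausible way to choose a covering subset. The sketch also assumes parameter values match exactly; real sidecar values may need rounding or a tolerance before grouping.

```python
from collections import defaultdict

# Hypothetical per-scan records. The key names mirror common BIDS sidecar
# fields, but the values are invented for illustration.
scans = [
    {"subject": "sub-01", "suffix": "bold", "RepetitionTime": 2.0, "EchoTime": 0.03},
    {"subject": "sub-01", "suffix": "T1w", "RepetitionTime": 2.3, "EchoTime": 0.003},
    {"subject": "sub-02", "suffix": "bold", "RepetitionTime": 2.0, "EchoTime": 0.03},
    {"subject": "sub-02", "suffix": "T1w", "RepetitionTime": 2.3, "EchoTime": 0.003},
    {"subject": "sub-03", "suffix": "bold", "RepetitionTime": 0.8, "EchoTime": 0.03},
    {"subject": "sub-03", "suffix": "T1w", "RepetitionTime": 2.3, "EchoTime": 0.003},
]


def group_by_parameters(scans, keys):
    """Map each unique combination of parameter values to the subjects that have it."""
    groups = defaultdict(list)
    for scan in scans:
        # Assumption: exact equality of values defines a group (no tolerance).
        signature = tuple(scan.get(key) for key in keys)
        groups[signature].append(scan["subject"])
    return dict(groups)


def pick_exemplars(groups):
    """Greedily choose subjects until every parameter group is represented."""
    uncovered = set(groups)
    chosen = []
    while uncovered:
        # For each subject, collect the still-uncovered groups it belongs to.
        coverage = defaultdict(set)
        for signature in uncovered:
            for subject in groups[signature]:
                coverage[subject].add(signature)
        # Prefer the subject covering the most groups; break ties by name
        # so the result is deterministic.
        best = min(coverage, key=lambda s: (-len(coverage[s]), s))
        chosen.append(best)
        uncovered -= coverage[best]
    return chosen


groups = group_by_parameters(scans, ("suffix", "RepetitionTime", "EchoTime"))
exemplars = pick_exemplars(groups)
print(len(groups), "parameter groups; exemplar subjects:", exemplars)
```

Running a pipeline on only the exemplar subjects exercises every acquisition variant once, which is the motivation the abstract gives for testing on a subset before processing the full dataset.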