Queder Nazek, Tien Vivian B, Abraham Sanu Ann, Urchs Sebastian Georg Wenzel, Helmer Karl G, Chaplin Derek, van Erp Theo G M, Kennedy David N, Poline Jean-Baptiste, Grethe Jeffrey S, Ghosh Satrajit S, Keator David B
Department of Psychiatry and Human Behavior, School of Medicine, University of California, Irvine, Irvine, CA, United States.
Department of Neurobiology and Behavior and Center for the Neurobiology of Learning and Memory, University of California, Irvine, Irvine, CA, United States.
Front Neuroinform. 2023 Jul 18;17:1174156. doi: 10.3389/fninf.2023.1174156. eCollection 2023.
Funding agencies and publishers motivate the biomedical research community to share and reuse data from studies and projects. Effectively combining and reusing neuroimaging data from publicly available datasets requires the capability to query across datasets in order to identify cohorts that match both neuroimaging and clinical/behavioral data criteria. Critical barriers to operationalizing such queries include, in part, the broad use of undefined study variables with limited or no annotations, which makes it difficult to understand the available data without significant interaction with the original authors. Using the Brain Imaging Data Structure (BIDS) to organize neuroimaging data has made querying across studies for specific image types possible at scale. However, beyond file naming and tightly controlled imaging directory structures, BIDS places very few constraints on the naming or meaning of ancillary variables or on experiment-specific metadata. In this work, we present NIDM-Terms, a set of user-friendly terminology management tools and associated software to better manage individual lab terminologies and help with annotating BIDS datasets. Using these tools to annotate BIDS data with a Neuroimaging Data Model (NIDM) semantic web representation enables queries across datasets to identify cohorts with specific neuroimaging and clinical/behavioral measurements. This manuscript describes the overall informatics structures and demonstrates the use of the tools to annotate BIDS datasets and perform integrated cross-cohort queries.
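The idea of annotating study variables so that cohorts can be found across datasets can be illustrated with a minimal sketch. It is not the NIDM-Terms software or its API. It assumes each dataset carries a BIDS-style sidecar dictionary whose entries use the `Description`, `Units`, and `TermURL` keys, and that two labs name the same concept differently (`age` versus `AGE_YRS`). The dataset names, participant values, concept URL, and the helper functions `columns_for_term` and `query_cohort` are all hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical concept URL; a real annotation would point at a curated term.
AGE_TERM = "https://example.org/terms/age"

# Two toy datasets (invented values) whose age variables have different local
# names but share the same concept annotation.
datasets: Dict[str, Dict[str, Any]] = {
    "ds-A": {
        "annotations": {
            "age": {
                "Description": "Age of the participant at scan",
                "Units": "years",
                "TermURL": AGE_TERM,
            }
        },
        "participants": [
            {"participant_id": "sub-01", "age": 34},
            {"participant_id": "sub-02", "age": 61},
        ],
    },
    "ds-B": {
        "annotations": {
            "AGE_YRS": {
                "Description": "Participant age",
                "Units": "years",
                "TermURL": AGE_TERM,
            }
        },
        "participants": [
            {"participant_id": "sub-01", "AGE_YRS": 58},
            {"participant_id": "sub-02", "AGE_YRS": 22},
        ],
    },
}


def columns_for_term(annotations: Dict[str, Dict[str, Any]], term_url: str) -> List[str]:
    """Return the local variable names annotated with the given concept URL."""
    return [name for name, meta in annotations.items() if meta.get("TermURL") == term_url]


def query_cohort(
    all_datasets: Dict[str, Dict[str, Any]], term_url: str, low: float, high: float
) -> List[Tuple[str, str]]:
    """Find (dataset, participant) pairs whose annotated value lies in [low, high]."""
    matches: List[Tuple[str, str]] = []
    for ds_name, ds in all_datasets.items():
        for column in columns_for_term(ds["annotations"], term_url):
            for row in ds["participants"]:
                value = row.get(column)
                if value is not None and low <= value <= high:
                    matches.append((ds_name, row["participant_id"]))
    return matches


cohort = query_cohort(datasets, AGE_TERM, 50, 70)
print(cohort)  # [('ds-A', 'sub-02'), ('ds-B', 'sub-01')]
```

Because the query matches on the shared concept URL rather than on column names, it finds participants in both datasets without knowing that one lab wrote `age` and the other `AGE_YRS`. A semantic web representation generalizes this lookup to graph queries.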