Bigdely-Shamlo Nima, Makeig Scott, Robbins Kay A
Qusp Labs, Qusp, San Diego CA, USA.
Swartz Center for Computational Neuroscience, University of California, San Diego, San Diego CA, USA.
Front Neuroinform. 2016 Mar 8;10:7. doi: 10.3389/fninf.2016.00007. eCollection 2016.
Large-scale analysis of EEG and other physiological measures promises new insights into brain processes and more accurate and robust brain-computer interface models. However, large-scale analyses of EEG have been held back by four obstacles: the absence of standardized vocabularies for annotating events in a machine-understandable manner, the welter of collection-specific data organizations, the difficulty of moving data across processing platforms, and the lack of agreed-upon standards for preprocessing. Here we describe a "containerized" approach and freely available tools we have developed to facilitate annotating, packaging, and preprocessing EEG data collections to enable data sharing, archiving, large-scale machine learning/data mining, and (meta-)analysis. The EEG Study Schema (ESS) comprises three data "Levels," each with its own XML-document schema and file/folder convention, plus a standardized (PREP) pipeline to move raw (Data Level 1) data to a basic preprocessed state (Data Level 2) suitable for application of a large class of EEG analysis methods. Researchers can ship a study as a single unit and operate on its data using a standardized interface. ESS does not require a central database and provides all the metadata necessary to execute a wide variety of EEG processing pipelines. The primary focus of ESS is automated in-depth analysis and meta-analysis of EEG studies. However, ESS can also encapsulate meta-information for other modalities, such as eye tracking, that are increasingly used in both laboratory and real-world neuroimaging. The ESS schema and tools are freely available at www.eegstudy.org, and a central catalog of over 850 GB of existing data in ESS format is available at studycatalog.org. These tools and resources are part of a larger effort to enable data sharing at sufficient scale for researchers to engage in truly large-scale EEG analysis and data mining (BigEEG.org).
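The container idea described above (a study shipped as one unit, with an XML document declaring its data level and files, and a pipeline that derives Level 2 from Level 1) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `study` and `recording` element names, the `StudyContainer` and `Recording` classes, the `to_level2` helper, and the file names are hypothetical and do not reproduce the actual ESS schema, the ESS tools, or the PREP pipeline.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Recording:
    """One data file listed in the container manifest (hypothetical layout)."""
    path: str
    modality: str = "EEG"


@dataclass
class StudyContainer:
    """A study shipped as one unit: a data level plus the files it contains."""
    title: str
    level: int
    recordings: list = field(default_factory=list)

    def to_xml(self) -> str:
        # Element and attribute names are illustrative, not the ESS schema.
        root = ET.Element("study", {"title": self.title, "level": str(self.level)})
        for rec in self.recordings:
            ET.SubElement(root, "recording",
                          {"path": rec.path, "modality": rec.modality})
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str) -> "StudyContainer":
        root = ET.fromstring(text)
        recs = [Recording(r.get("path"), r.get("modality"))
                for r in root.findall("recording")]
        return cls(root.get("title"), int(root.get("level")), recs)


def to_level2(study: StudyContainer, preprocess) -> StudyContainer:
    """Derive a Level 2 container from a Level 1 container.

    `preprocess` is a stand-in for a standardized preprocessing step: it
    receives a Level 1 file path and returns the path of the processed file.
    """
    if study.level != 1:
        raise ValueError("expected a Level 1 container")
    recs = [Recording(preprocess(r.path), r.modality) for r in study.recordings]
    return StudyContainer(study.title, 2, recs)


# Usage: build a Level 1 container, derive Level 2, round-trip through XML.
level1 = StudyContainer("demo study", 1, [Recording("session1/raw.set")])
level2 = to_level2(level1, lambda p: p.replace("raw", "prep"))
restored = StudyContainer.from_xml(level2.to_xml())
```

Because the manifest travels with the data, a recipient can reconstruct the study description from the XML alone, which is the property that lets a container work without a central database.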