Laboratory of Neuro Imaging (LONI), David Geffen School of Medicine at UCLA, University of California, Los Angeles, 635 S. Charles Young Drive, Suite 225, Los Angeles, CA, 90095-7334, USA.
Brain Imaging Behav. 2014 Jun;8(2):311-22. doi: 10.1007/s11682-013-9248-x.
The volume, diversity and velocity of biomedical data are increasing exponentially, providing petabytes of new neuroimaging and genetics data every year. At the same time, tens of thousands of computational algorithms are developed and reported in the literature, along with thousands of software tools and services. Users demand intuitive, quick and platform-agnostic access to data, software tools, and infrastructure from millions of hardware devices. This explosion of information, scientific techniques, computational models, and technological advances leads to enormous challenges in data analysis, evidence-based biomedical inference and reproducibility of findings. The Pipeline workflow environment provides a crowd-based distributed solution for consistent management of these heterogeneous resources. The Pipeline allows multiple (local) clients and (remote) servers to connect, exchange protocols, control execution, monitor the states of different tools or hardware, and share complete protocols as portable XML workflows. In this paper, we demonstrate several advanced computational neuroimaging and genetics case studies and end-to-end pipeline solutions. These are implemented as graphical workflow protocols for analyzing imaging (sMRI, fMRI, DTI), phenotypic (demographic, clinical), and genetic (SNP) data.