Finding the needle in a high-dimensional haystack: Canonical correlation analysis for neuroscientists.

Author affiliations

Department of Psychology, University of York, Heslington, York, United Kingdom; Sackler Center for Consciousness Science, University of Sussex, Brighton, United Kingdom.

Department of Psychology, University of York, Heslington, York, United Kingdom.

Publication information

Neuroimage. 2020 Aug 1;216:116745. doi: 10.1016/j.neuroimage.2020.116745. Epub 2020 Apr 8.

DOI:10.1016/j.neuroimage.2020.116745

PMID:32278095

Abstract

The 21st century marks the emergence of "big data" with a rapid increase in the availability of datasets with multiple measurements. In neuroscience, brain-imaging datasets are more commonly accompanied by dozens or hundreds of phenotypic subject descriptors on the behavioral, neural, and genomic levels. The complexity of such "big data" repositories offers new opportunities and poses new challenges for systems neuroscience. Canonical correlation analysis (CCA) is a prototypical family of methods that is useful in identifying the links between variable sets from different modalities. Importantly, CCA is well suited to describing relationships across multiple sets of data, such as in recently available big biomedical datasets. Our primer discusses the rationale, promises, and pitfalls of CCA.
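
As a rough illustration of what CCA computes when it links two variable sets, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes each variable set has more observations than variables and full column rank; the function name `cca`, the QR-based solver, and the synthetic "brain" and "behavior" data are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def cca(X, Y, n_components=1):
    """Illustrative CCA via QR + SVD (assumes n_samples > n_features, full rank)."""
    # Center each variable so that inner products behave like covariances.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the two variable sets.
    Qx, Rx = np.linalg.qr(X)
    Qy, Ry = np.linalg.qr(Y)
    # Singular values of Qx^T Qy are the canonical correlations.
    U, s, Vt = np.linalg.svd(Qx.T @ Qy)
    # Map the singular vectors back to weights on the original variables.
    A = np.linalg.solve(Rx, U[:, :n_components])
    B = np.linalg.solve(Ry, Vt.T[:, :n_components])
    return A, B, s[:n_components]


# Synthetic example (hypothetical data): one latent signal z is shared
# by the first variable of each set; all other variables are pure noise.
rng = np.random.default_rng(0)
n = 500
z = rng.standard_normal(n)
X = np.column_stack([
    z + 0.5 * rng.standard_normal(n),
    rng.standard_normal(n),
    rng.standard_normal(n),
])
Y = np.column_stack([
    z + 0.5 * rng.standard_normal(n),
    rng.standard_normal(n),
])

A, B, r = cca(X, Y, n_components=2)
print("canonical correlations:", r)
```

Here `r[0]` is the correlation between the first pair of canonical variates, `X @ A[:, 0]` and `Y @ B[:, 0]`. In this simulated data only that first pair reflects the shared latent signal, so the second canonical correlation should be close to zero.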
