Myers Emma M, Bartlett Christopher W, Machiraju Raghu, Bohland Jason W
Graduate Program for Neuroscience, Boston University, Boston, MA 02215, USA.
Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, The Ohio State University, Columbus, OH 43205, USA.
Methods. 2015 Feb;73:54-70. doi: 10.1016/j.ymeth.2014.12.010. Epub 2014 Dec 15.
Studies of the brain's transcriptome have become prominent in recent years, resulting in an accumulation of datasets with somewhat distinct attributes. These datasets, which are often analyzed only in isolation, are also often collected with divergent goals, which are reflected in their sampling properties. While many researchers have been interested in sampling gene expression in one or a few brain areas in a large number of subjects, recent efforts from the Allen Institute for Brain Science and others have focused instead on dense neuroanatomical sampling, necessarily limiting the number of individual donor brains studied. The purpose of the present work is to develop methods that draw on the complementary strengths of these two types of datasets for the study of the human brain, and to characterize the anatomical specificity of gene expression profiles and gene co-expression networks derived from human brains using different technologies. The approach is applied using two publicly accessible datasets: (1) the high-anatomical-resolution Allen Human Brain Atlas (AHBA; Hawrylycz et al., 2012) and (2) a dataset with a relatively large sample size but comparatively coarse neuroanatomical sampling, described previously by Gibbs et al. (2010). We found a relatively high degree of correspondence in differentially expressed genes and regional gene expression profiles across the two datasets. Gene co-expression networks defined in individual brain regions were less congruent, but also showed modest anatomical specificity. Using gene modules derived from the Gibbs dataset and from curated gene lists, we demonstrated varying degrees of anatomical specificity based on two classes of methods, one focused on network modularity and the other focused on enrichment of expression levels. Two approaches to assessing the statistical significance of a gene set's modularity in a given brain region were studied; they provide complementary information about the anatomical specificity of a gene network of interest. Overall, the present work demonstrates the feasibility of cross-dataset analysis of human brain microarray studies, and offers a new approach to annotating gene lists in a neuroanatomical context.
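The abstract does not specify the two significance tests, so the following is only a generic sketch of one way such a test could be set up, not the authors' method. It assumes a per-region expression matrix of genes by samples, uses mean pairwise Pearson correlation within the gene set as a simple stand-in for network modularity, and compares it against random gene sets of the same size. The function names, the `n_perm` default, and the choice of statistic are all illustrative assumptions.

```python
import numpy as np


def within_set_coexpression(expr, idx):
    """Mean pairwise Pearson correlation among the genes in idx.

    expr: array of shape (n_genes, n_samples) for one brain region.
    This is an assumed proxy for modularity, not the paper's measure.
    """
    corr = np.corrcoef(expr[idx])
    upper = np.triu_indices(len(idx), k=1)  # each gene pair counted once
    return corr[upper].mean()


def permutation_p_value(expr, gene_idx, n_perm=1000, seed=0):
    """Compare a gene set to random gene sets of the same size."""
    rng = np.random.default_rng(seed)
    observed = within_set_coexpression(expr, gene_idx)
    null = np.empty(n_perm)
    for i in range(n_perm):
        random_idx = rng.choice(expr.shape[0], size=len(gene_idx), replace=False)
        null[i] = within_set_coexpression(expr, random_idx)
    # Add-one correction keeps the estimated p-value above zero.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value
```

Running the same test separately on each region's matrix would give one p-value per region, which is one simple way to ask whether a gene set is tightly co-expressed in some regions but not others.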