Ge Tian, Nichols Thomas E, Lee Phil H, Holmes Avram J, Roffman Joshua L, Buckner Randy L, Sabuncu Mert R, Smoller Jordan W
Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital/Harvard Medical School, Charlestown, MA 02129; Psychiatric and Neurodevelopmental Genetics Unit, Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02138;
Department of Statistics & Warwick Manufacturing Group, The University of Warwick, Coventry CV4 7AL, United Kingdom;
Proc Natl Acad Sci U S A. 2015 Feb 24;112(8):2479-84. doi: 10.1073/pnas.1415603112. Epub 2015 Feb 9.
The discovery and prioritization of heritable phenotypes is a computational challenge in a variety of settings, including neuroimaging genetics and analyses of the vast phenotypic repositories in electronic health record systems and population-based biobanks. Classical estimates of heritability require twin or pedigree data, which can be costly and difficult to acquire. Genome-wide complex trait analysis is an alternative tool for computing heritability estimates from unrelated individuals, using genome-wide data that are increasingly ubiquitous, but it is computationally demanding and becomes difficult to apply when evaluating very large numbers of phenotypes. Here we present a fast and accurate statistical method for high-dimensional heritability analysis using genome-wide SNP data from unrelated individuals, termed massively expedited genome-wide heritability analysis (MEGHA), and accompanying nonparametric sampling techniques that enable flexible inferences for arbitrary statistics of interest. MEGHA produces estimates and significance measures of heritability with several orders of magnitude less computational time than existing methods, making heritability-based prioritization of millions of phenotypes based on data from unrelated individuals tractable, to our knowledge, for the first time. As a demonstration of application, we conducted heritability analyses on global and local morphometric measurements derived from brain structural MRI scans, using genome-wide SNP data from 1,320 unrelated young healthy adults of non-Hispanic European ancestry. We also computed surface maps of heritability for cortical thickness measures and empirically localized cortical regions where thickness measures were significantly heritable. Our analyses demonstrate the unique capability of MEGHA for large-scale heritability-based screening and high-dimensional heritability profile construction.
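To make concrete what estimating heritability from unrelated individuals involves, the following is a minimal sketch on simulated data. It is not MEGHA, and it is not the restricted maximum likelihood procedure used in genome-wide complex trait analysis; it substitutes Haseman-Elston regression, a simple moment-based estimator that regresses phenotype cross-products on pairwise genetic relatedness. The function names (`simulate`, `he_regression`), the sample size, the number of SNPs, and the simulated heritability of 0.5 are all assumptions chosen for illustration.

```python
import numpy as np


def simulate(n, m, h2, rng):
    """Simulate standardized genotypes and a phenotype with SNP heritability h2.

    Hypothetical helper for illustration only; not part of MEGHA.
    """
    freqs = rng.uniform(0.1, 0.9, size=m)            # allele frequencies
    G = rng.binomial(2, freqs, size=(n, m)).astype(float)
    G = G[:, G.std(axis=0) > 0]                      # drop monomorphic SNPs
    Z = (G - G.mean(axis=0)) / G.std(axis=0)         # standardize each SNP
    m_eff = Z.shape[1]
    beta = rng.normal(0.0, np.sqrt(h2 / m_eff), size=m_eff)
    y = Z @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    return Z, y


def he_regression(Z, y):
    """Haseman-Elston regression estimate of SNP heritability.

    Regresses y_i * y_j on the genetic relationship K_ij over all pairs
    i < j; the slope is the heritability estimate.
    """
    n, m = Z.shape
    K = Z @ Z.T / m                                  # genetic relationship matrix
    y = (y - y.mean()) / y.std()                     # unit-variance phenotype
    iu = np.triu_indices(n, k=1)                     # off-diagonal pairs only
    slope, _intercept = np.polyfit(K[iu], np.outer(y, y)[iu], 1)
    return slope


rng = np.random.default_rng(0)
Z, y = simulate(n=800, m=400, h2=0.5, rng=rng)
h2_hat = he_regression(Z, y)
print(f"estimated heritability: {h2_hat:.2f}")
```

Each call handles a single phenotype and forms an n-by-n relationship matrix, so repeating it naively over millions of phenotypes is where the computational burden described above arises.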