Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Division of Biostatistics, Dana-Farber Cancer Institute, Boston, MA 02215, USA.
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
Am J Hum Genet. 2019 Apr 4;104(4):611-624. doi: 10.1016/j.ajhg.2019.02.008. Epub 2019 Mar 21.
Regulatory elements, e.g., enhancers and promoters, have been widely reported to be enriched for disease and complex trait heritability. We investigated how this enrichment varies with the age of the underlying genome sequence, the conservation of regulatory function across species, and the target gene of the regulatory element. We estimated heritability enrichment by applying stratified LD score regression to summary statistics from 41 independent diseases and complex traits (average N = 320K) and meta-analyzing results across traits. Enrichment of human putative enhancers and promoters was larger in elements with older sequence age, assessed via alignment with other species irrespective of conserved functionality: putative enhancer elements with ancient sequence age (older than the split between marsupial and placental mammals) were 8.8× enriched (versus 2.5× for all putative enhancers; p = 3e-14), and promoter elements with ancient sequence age were 13.5× enriched (versus 5.1× for all promoters; p = 5e-16). Enrichment of human putative enhancers and promoters was also larger in elements whose regulatory function was conserved across species, e.g., human putative enhancers that were enhancers in ≥5 of 9 other mammals were 4.6× enriched (p = 5e-12 versus all putative enhancers). Enrichment of human promoters was larger in promoters of loss-of-function intolerant genes: 12.0× enrichment (p = 8e-15 versus all promoters). The mean value of several measures of negative selection within these genomic annotations mirrored all of these findings. Notably, the annotations with these excess heritability enrichments were jointly significant conditional on each other and on our baseline-LD model, which includes a broad set of coding, conserved, regulatory, and LD-related annotations.
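To make the reported quantity concrete: in stratified LD score regression, the enrichment of an annotation is the proportion of heritability it explains divided by the proportion of SNPs it contains. The Python sketch below computes that ratio and pools per-trait estimates with fixed-effect inverse-variance weighting. The pooling scheme, the function names, and all input numbers are illustrative assumptions; the abstract does not specify the meta-analysis procedure used.

```python
from math import sqrt


def enrichment(h2_annot, h2_total, snps_annot, snps_total):
    """Heritability enrichment: (share of heritability) / (share of SNPs)."""
    prop_h2 = h2_annot / h2_total
    prop_snps = snps_annot / snps_total
    return prop_h2 / prop_snps


def inverse_variance_meta(estimates, std_errors):
    """Fixed-effect inverse-variance pooling (an assumed scheme, see lead-in).

    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical inputs, not values from the study: an annotation covering
# 1% of SNPs that explains 20% of the SNP heritability of one trait.
single_trait = enrichment(h2_annot=0.10, h2_total=0.50,
                          snps_annot=10_000, snps_total=1_000_000)

# Hypothetical per-trait enrichment estimates with their standard errors.
pooled, pooled_se = inverse_variance_meta([8.0, 10.0, 9.0], [1.0, 2.0, 1.5])
print(single_trait, pooled, pooled_se)
```

An enrichment of 1 means the annotation explains heritability in proportion to its size, so values such as the 8.8× reported for ancient-sequence putative enhancers indicate far more heritability per SNP than the genome-wide average.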