Functional architecture of low-frequency variants highlights strength of negative selection across coding and non-coding annotations.
Author information
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Publication information
Nat Genet. 2018 Nov;50(11):1600-1607. doi: 10.1038/s41588-018-0231-8. Epub 2018 Oct 8.
Common variant heritability has been widely reported to be concentrated in variants within cell-type-specific non-coding functional annotations, but little is known about low-frequency variant functional architectures. We partitioned the heritability of both low-frequency (0.5% ≤ minor allele frequency < 5%) and common (minor allele frequency ≥ 5%) variants in 40 UK Biobank traits across a broad set of functional annotations. We determined that non-synonymous coding variants explain 17 ± 1% of low-frequency variant heritability ($h^2_{\mathrm{lf}}$) versus 2.1 ± 0.2% of common variant heritability ($h^2_{\mathrm{c}}$). Cell-type-specific non-coding annotations that were significantly enriched for $h^2_{\mathrm{c}}$ of corresponding traits were similarly enriched for $h^2_{\mathrm{lf}}$ for most traits, but more enriched for brain-related annotations and traits. For example, H3K4me3 marks in brain dorsolateral prefrontal cortex explain 57 ± 12% of $h^2_{\mathrm{lf}}$ versus 12 ± 2% of $h^2_{\mathrm{c}}$ for neuroticism. Forward simulations confirmed that low-frequency variant enrichment depends on the mean selection coefficient of causal variants in the annotation, and can be used to predict effect size variance of causal rare variants (minor allele frequency < 0.5%).
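As a rough illustration of the frequency classes and the comparisons reported in the abstract, the following Python sketch bins a variant by minor allele frequency and computes how many times larger an annotation's share of low-frequency variant heritability is than its share of common variant heritability. The function names `maf_class` and `share_ratio` are hypothetical and are not taken from the paper. The ratio shown is a simple quotient of the reported point estimates. It is not the enrichment statistic used in heritability partitioning, which also depends on the proportion of variants falling in the annotation, and it ignores the reported standard errors.

```python
def maf_class(maf: float) -> str:
    """Bin a minor allele frequency using the thresholds stated in the abstract."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    if maf < 0.005:   # MAF < 0.5%
        return "rare"
    if maf < 0.05:    # 0.5% <= MAF < 5%
        return "low-frequency"
    return "common"   # MAF >= 5%


def share_ratio(lf_share: float, common_share: float) -> float:
    """Quotient of the low-frequency and common heritability shares (illustrative only)."""
    if common_share <= 0:
        raise ValueError("common_share must be positive")
    return lf_share / common_share


# Point estimates quoted in the abstract, in percent of heritability explained:
# (share of low-frequency variant heritability, share of common variant heritability)
reported = {
    "non-synonymous coding variants": (17.0, 2.1),
    "H3K4me3, brain dorsolateral prefrontal cortex (neuroticism)": (57.0, 12.0),
}

for annotation, (lf_share, common_share) in reported.items():
    print(f"{annotation}: ratio of shares = {share_ratio(lf_share, common_share):.2f}")
```

Because the two frequency classes can contain different proportions of variants in a given annotation, this quotient should be read only as a quick comparison of the quoted percentages.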