Regeneron Genetics Center, Tarrytown, NY, USA.
Nature. 2021 Nov;599(7886):628-634. doi: 10.1038/s41586-021-04103-z. Epub 2021 Oct 18.
A major goal in human genetics is to use natural variation to understand the phenotypic consequences of altering each protein-coding gene in the genome. Here we used exome sequencing to explore protein-altering variants and their consequences in 454,787 participants in the UK Biobank study. We identified 12 million coding variants, including around 1 million loss-of-function and around 1.8 million deleterious missense variants. When these were tested for association with 3,994 health-related traits, we found 564 genes with trait associations at P ≤ 2.18 × 10⁻¹¹. Rare variant associations were enriched in loci from genome-wide association studies (GWAS), but most (91%) were independent of common variant signals. We discovered several risk-increasing associations with traits related to liver disease, eye disease and cancer, among others, as well as risk-lowering associations for hypertension (SLC9A3R2), diabetes (MAP3K15, FAM234A) and asthma (SLC27A3). Six genes were associated with brain imaging phenotypes, including two involved in neural development (GBE1, PLD1). Of the signals available and powered for replication in an independent cohort, 81% were confirmed; furthermore, association signals were generally consistent across individuals of European, Asian and African ancestry. We illustrate the ability of exome sequencing to identify gene-trait associations, elucidate gene function and pinpoint effector genes that underlie GWAS signals at scale.