Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia.
Division of Endocrinology, Boston Children's Hospital, Boston, MA, USA.
Nature. 2022 Oct;610(7933):704-712. doi: 10.1038/s41586-022-05275-y. Epub 2022 Oct 12.
Common single-nucleotide polymorphisms (SNPs) are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes. Here, using data from a genome-wide association study of 5.4 million individuals of diverse ancestries, we show that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a mean size of around 90 kb, covering about 21% of the genome. The density of independent associations varies across the genome and the regions of increased density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs (or all SNPs in the HapMap 3 panel) account for 40% (45%) of phenotypic variance in populations of European ancestry but only around 10-20% (14-24%) in populations of other ancestries. Effect sizes, associated regions and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely to be explained by linkage disequilibrium and differences in allele frequency within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than are needed to implicate causal genes and variants. Overall, this study provides a comprehensive map of specific genomic regions that contain the vast majority of common height-associated variants. Although this map is saturated for populations of European ancestry, further research is needed to achieve equivalent saturation in other ancestries.
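As a rough consistency check on the figures quoted above, the segment count and mean segment size can be multiplied out and compared with the stated genome coverage. The sketch below is illustrative only: the genome length of about 3.1 Gb is an assumption that does not appear in the abstract, and the function and constant names are hypothetical.

```python
# Figures quoted in the abstract.
N_INDEPENDENT_SNPS = 12_111   # independent height-associated SNPs
N_SEGMENTS = 7_209            # non-overlapping genomic segments
MEAN_SEGMENT_KB = 90          # approximate mean segment size, in kb

# Assumption (not stated in the abstract): a human genome length of
# roughly 3.1 Gb, used here only as a round reference value.
ASSUMED_GENOME_BP = 3.1e9


def covered_fraction(n_segments, mean_kb, genome_bp):
    """Fraction of the genome spanned by n_segments of mean size mean_kb."""
    covered_bp = n_segments * mean_kb * 1_000
    return covered_bp / genome_bp


fraction = covered_fraction(N_SEGMENTS, MEAN_SEGMENT_KB, ASSUMED_GENOME_BP)
snps_per_segment = N_INDEPENDENT_SNPS / N_SEGMENTS

print(f"Approximate genome coverage: {fraction:.1%}")
print(f"Mean independent SNPs per segment: {snps_per_segment:.2f}")
```

Under the assumed genome length, the product comes out close to the "about 21%" reported in the abstract, and the ratio of SNPs to segments gives an average of fewer than two independent associations per segment. The abstract notes that this density varies across the genome, so the average says nothing about individual regions.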