Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany.
Complete Genomics, Inc., San Jose, CA 95112, USA.
Nucleic Acids Res. 2019 Apr 8;47(6):2981-2995. doi: 10.1093/nar/gkz031.
To fully understand human genetic variation and its functional consequences, the specific distribution of variants between the two chromosomal homologues of genes must be known. The 'phase' of variants can significantly impact gene function and phenotype. To assess patterns of phase at large scale, we have analyzed 18 121 autosomal genes in 1092 statistically phased genomes from the 1000 Genomes Project and 184 experimentally phased genomes from the Personal Genome Project. Here we show that genes with cis-configurations of coding variants are more frequent than genes with trans-configurations in a genome, with global cis/trans ratios of ∼60:40. Significant cis-abundance was observed in virtually all genomes in all populations. Moreover, we identified a large group of genes exhibiting cis-configurations of protein-changing variants in excess, so-called 'cis-abundant genes', and a smaller group of 'trans-abundant genes'. These two gene categories were functionally distinguishable, and exhibited strikingly different distributional patterns of protein-changing variants. Underlying these phenomena was a shared set of phase-sensitive genes of importance for adaptation and evolution. This work establishes common patterns of phase as key characteristics of diploid human exomes and provides evidence for their functional significance, highlighting the importance of phase for the interpretation of protein-coding genetic variation and gene function.
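The cis/trans distinction described above can be illustrated with a minimal sketch. It assumes that a gene is scored only when it carries at least two heterozygous coding variants, that genotypes are given as VCF-style phased strings ("0|1" or "1|0"), and that a gene counts as cis when all variant alleles lie on the same homologue and as trans otherwise. The function names and the toy genome are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter

HET_PHASED = ("0|1", "1|0")  # phased heterozygous genotypes, VCF-style


def classify_gene(phased_genotypes):
    """Return "cis", "trans", or None for one gene.

    Assumption: only heterozygous variants are informative for phase, and a
    gene needs at least two of them to have a configuration at all.
    """
    het = [g for g in phased_genotypes if g in HET_PHASED]
    if len(het) < 2:
        return None
    # All variant alleles on one homologue -> cis; split across both -> trans.
    return "cis" if len(set(het)) == 1 else "trans"


def cis_trans_fractions(genome):
    """Return (cis fraction, trans fraction) over all scorable genes."""
    counts = Counter()
    for genotypes in genome.values():
        label = classify_gene(genotypes)
        if label is not None:
            counts[label] += 1
    total = counts["cis"] + counts["trans"]
    if total == 0:
        return 0.0, 0.0
    return counts["cis"] / total, counts["trans"] / total


# Toy input (hypothetical): gene name -> phased genotypes of its coding variants.
toy_genome = {
    "GENE_A": ["0|1", "0|1"],          # cis
    "GENE_B": ["0|1", "1|0"],          # trans
    "GENE_C": ["1|0", "1|0", "1|0"],   # cis
    "GENE_D": ["0|1"],                 # one heterozygous variant: not scored
    "GENE_E": ["1|0", "0|1", "1|1"],   # trans; the homozygous site is ignored
    "GENE_F": ["0|1", "0|1", "0|1"],   # cis
}

cis_frac, trans_frac = cis_trans_fractions(toy_genome)
print(f"cis: {cis_frac:.2f}, trans: {trans_frac:.2f}")
```

Computing these fractions per genome, and then per gene across genomes, is one plausible way to arrive at genome-wide cis/trans ratios and at gene-level categories such as "cis-abundant"; the statistical thresholds used in the study are not given in the abstract and are not reproduced here.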