Bioinformatics Research Centre, Aarhus University, 8000 Aarhus C., Denmark.
Department of Clinical Medicine, Aarhus University, 8200 Aarhus N., Denmark.
Genome Res. 2017 Sep;27(9):1597-1607. doi: 10.1101/gr.218891.116. Epub 2017 Aug 3.
Genes in the major histocompatibility complex (MHC, also known as HLA) play a critical role in the immune response, and variation within the extended 4-Mb region shows association with major risks of many diseases. Yet deciphering the underlying causes of these associations is difficult because the MHC is the most polymorphic region of the genome, with a complex linkage disequilibrium structure. Here, we reconstruct full MHC haplotypes from de novo assembled trios without relying on a reference genome and perform evolutionary analyses. We report 100 full MHC haplotypes and call a large set of structural variants in the region for future use in imputation with GWAS data. We also present the first complete analysis of the recombination landscape in the entire region and show how balancing selection at classical genes has linked effects on the frequency of variants throughout the region.