School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, P. R. China.
Clinical Research Institute, The First People's Hospital of FoShan (Affiliated FoShan Hospital of Sun Yat-sen University), 528000, China.
Genetics. 2018 Jun;209(2):579-589. doi: 10.1534/genetics.118.301028. Epub 2018 Apr 18.
It has been challenging to determine the disease-causing variant(s) for most major histocompatibility complex (MHC)-associated diseases. However, it is becoming increasingly clear that regulatory variation is pervasive and is a fundamentally important mechanism governing phenotypic diversity and disease susceptibility. We gathered DNase I data from 136 human cells to characterize the regulatory landscape of the MHC region, including 4867 DNase I hypersensitive sites (DHSs). We identified thousands of regulatory elements that have been gained or lost in the human or chimpanzee genomes since their evolutionary divergence. We compared alignments of the DHSs across six primates and found 149 DHSs with convincing evidence of positive and/or purifying selection. Of these DHSs, 24 evolved rapidly in the human lineage relative to neutral sequences. Using FIMO (Find Individual Motif Occurrences) and UCSC (University of California, Santa Cruz) ChIP-seq data, we identified 15 instances of transcription-factor-binding motif gains and observed 16 GWAS (genome-wide association study) SNPs associated with diseases within these 24 DHSs. Combining eQTL and Hi-C data, we found five SNPs located in human-gained motifs that affect expression of the corresponding genes, two of which closely matched DHS target genes. In addition, rs7756521, a SNP significant at the genome-wide level, likely affects DDR expression and represents a causal genetic variant for HIV-1 control. These results indicate that species-specific motif gains or losses in rapidly evolving DHSs in primate genomes might play a role in adaptive evolution, and they provide new evidence for a potentially causal role for these GWAS SNPs.