Glinsky Gennadi V
Institute of Engineering in Medicine, University of California-San Diego
Genome Biol Evol. 2016 Sep 19;8(9):2774-88. doi: 10.1093/gbe/evw185.
Thousands of candidate human-specific regulatory sequences (HSRS) have been identified, supporting the hypothesis that phenotypes unique to humans result from human-specific alterations of genomic regulatory networks. Collectively, a compendium of multiple diverse families of HSRS that are functionally and structurally divergent from Great Apes could be defined as the backbone of human-specific genomic regulatory networks. Here, a conservation pattern analysis of 18,364 candidate HSRS was carried out, requiring that 100% of bases remap during the alignments of human, chimpanzee, and bonobo sequences. A total of 5,535 candidate HSRS were identified that are: (i) highly conserved in Great Apes; (ii) evolved by the exaptation of highly conserved ancestral DNA; (iii) defined by either the acceleration of mutation rates on the human lineage or the functional divergence from non-human primates. The pathway of exaptation of highly conserved ancestral DNA seems mechanistically distinct from the evolution of regulatory DNA segments driven by the species-specific expansion of transposable elements. Genome-wide proximity placement analysis of HSRS revealed that a small fraction of topologically associating domains (TADs) contain more than half of the HSRS from four distinct families. TADs that are enriched for HSRS, termed rapidly evolving in humans TADs (revTADs), comprise 0.8-10.3% of the 3,127 TADs in the hESC genome. RevTADs manifest distinct correlation patterns between placements of human accelerated regions, human-specific transcription factor-binding sites, and recombination rates. Within revTAD boundaries there is a significant enrichment of hESC-enhancers, primate-specific CTCF-binding sites, human-specific RNAPII-binding sites, hCONDELs, and H3K4me3 peaks with human-specific enrichment at TSS in prefrontal cortex neurons (P < 0.0001 in all instances). The present analysis supports the idea that the phenotypic divergence of Homo sapiens is driven by the evolution of human-specific genomic regulatory networks via at least two mechanistically distinct pathways that create divergent sequences of regulatory DNA: (i) recombination-associated exaptation of highly conserved ancestral regulatory DNA segments; (ii) human-specific insertions of transposable elements.
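The proximity placement idea summarized above (counting how many HSRS fall inside each TAD, then asking how few TADs hold more than half of them) can be illustrated with a minimal Python sketch. This is not the author's pipeline: the function names, the single-chromosome half-open interval model, and the toy coordinates are all assumptions made for illustration only.

```python
from bisect import bisect_right
from collections import Counter


def assign_to_tads(tads, sites):
    """Count sites per TAD.

    tads:  sorted, non-overlapping (start, end) half-open intervals on one
           chromosome (hypothetical coordinates).
    sites: genomic positions of candidate regulatory sequences.
    Returns a Counter mapping TAD index -> number of sites inside it.
    """
    starts = [start for start, _ in tads]
    counts = Counter()
    for pos in sites:
        i = bisect_right(starts, pos) - 1  # last TAD starting at or before pos
        if i >= 0 and pos < tads[i][1]:    # pos must lie before that TAD's end
            counts[i] += 1
    return counts


def smallest_tad_set_covering(counts, total_sites, fraction=0.5):
    """Pick the fewest TADs that together hold more than `fraction` of sites.

    Because TADs are disjoint here, taking them in descending order of
    site count yields a smallest such set.
    """
    chosen, covered = [], 0
    for idx, n in counts.most_common():
        if covered > fraction * total_sites:
            break
        chosen.append(idx)
        covered += n
    return chosen, covered


# Toy data (illustrative only, not taken from the study).
tads = [(0, 100), (100, 200), (200, 300), (300, 400), (400, 500)]
sites = [5, 10, 20, 30, 150, 210, 220, 350, 450, 460]

counts = assign_to_tads(tads, sites)
chosen, covered = smallest_tad_set_covering(counts, len(sites))
print(f"{len(chosen)} of {len(tads)} TADs hold {covered} of {len(sites)} sites")
```

In this toy case two of the five TADs hold six of the ten sites. A real analysis would work per chromosome with actual TAD boundary calls and would need a statistical test of enrichment, which this sketch does not attempt.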