Department of Biochemistry and Molecular Genetics, Human Medical Genetics Program, University of Colorado School of Medicine, Aurora, Colorado 80045, USA.
G3 (Bethesda). 2012 Sep;2(9):977-86. doi: 10.1534/g3.112.003061. Epub 2012 Sep 1.
DUF1220 protein domains exhibit the most extreme human lineage-specific (HLS) copy number increase of any protein-coding region in the human genome and have recently been linked to evolutionary and pathological changes in brain size (e.g., 1q21-associated microcephaly). These findings lend support to the view that DUF1220 domain dosage is a key factor in the determination of primate (and human) brain size. Here we analyze 41 animal genomes and present the most complete account to date of the evolutionary history and genome organization of DUF1220 domains and the gene family that encodes them (NBPF). The novel features identified by this analysis include a DUF1220 domain precursor in nonmammalian vertebrates, a unique predicted promoter common to all mammalian NBPF genes, six distinct clades into which DUF1220 sequences can be subdivided, and a previously unknown member of the NBPF gene family (NBPF25). Most importantly, we show that the exceptional HLS increase in DUF1220 copy number (from 102 in our last common ancestor with chimpanzee to 272 in human; an average HLS increase of ~28 copies every million years since the Homo/Pan split) was driven by intragenic domain hyperamplification. This increase primarily involved a 4.7 kb, tandemly repeated unit of three DUF1220 domains that we have named the HLS DUF1220 triplet, a motif that is a likely candidate to underlie key properties unique to the Homo sapiens brain. Interestingly, all copies of the HLS DUF1220 triplet lie within a human-specific pericentric inversion that also includes the 1q12 C-band, a polymorphic heterochromatin expansion that is unique to the human genome. Both cytogenetic features likely played key roles in the rapid HLS DUF1220 triplet hyperamplification, which is among the most striking genomic changes specific to the human lineage.
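The average rate quoted in the abstract can be checked with simple arithmetic. The sketch below uses the two copy numbers given above (102 and 272); the divergence time of about 6 million years is an assumption, not a figure stated in the abstract, chosen because it is the value the quoted rate implies. The names used are illustrative only.

```python
# Copy numbers are taken from the abstract; the divergence time is an assumption.
ANCESTRAL_COPIES = 102    # last common ancestor of human and chimpanzee
HUMAN_COPIES = 272        # human
ASSUMED_SPLIT_MYA = 6.0   # assumed Homo/Pan divergence time (million years)


def hls_rate(ancestral, human, split_mya):
    """Average number of copies gained per million years on the human lineage."""
    return (human - ancestral) / split_mya


gain = HUMAN_COPIES - ANCESTRAL_COPIES
rate = hls_rate(ANCESTRAL_COPIES, HUMAN_COPIES, ASSUMED_SPLIT_MYA)
print(f"net gain: {gain} copies; average rate: {rate:.1f} copies per million years")
```

Under this assumption the net gain of 170 copies corresponds to roughly 28 copies per million years, consistent with the figure in the abstract; a different divergence estimate would shift the rate proportionally.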