Zhang Zhengdong D, Cayting Philip, Weinstock George, Gerstein Mark
Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, USA.
Mol Biol Evol. 2008 Jan;25(1):131-43. doi: 10.1093/molbev/msm251. Epub 2007 Dec 7.
Transcription factor pseudogenes have not previously been studied systematically. Nuclear receptors (NRs) constitute one of the largest groups of transcription factors in animals (e.g., 48 NRs in human). The availability of whole-genome sequences enables a global inventory of NR pseudogenes in a number of vertebrate model organisms. Here we identify the NR pseudogenes in 8 vertebrate organisms and make our results available online at http://www.pseudogene.org/nr. The assignments reveal that the generation and distribution of NR pseudogenes as a group run contrary to expectations derived from previous large-scale pseudogene studies. In particular, 1) despite its large size, the NR gene family has only a very small number of pseudogenes in each of the vertebrate genomes examined; 2) despite the low transcription levels of NR genes, all but one of the NR pseudogenes identified in this study are retropseudogenes; and 3) no duplicated NR pseudogenes are found, even though the NR gene family expanded through several waves of gene duplication events. Our analyses further reveal a number of interesting aspects of NR pseudogenes. First, through careful sequence analysis, we identify remnant introns in 2 mouse retropseudogenes, psiRev-erbbeta and psiLRH1. Generated from partially processed pre-mRNAs, they appear to be rare examples of "semiprocessed" pseudogenes. Second, by comparing genomic sequences, we uncover a pseudogene that is unique to the human lineage relative to chimpanzee. Generated by a recent duplication of a segment in the human genome, it is a "duplicated-processed" pseudogene, belonging to a new pseudogene species. Finally, FXRbeta was nonfunctionalized in the human lineage and thus appears to be an example of a rare unitary pseudogene. By comparing orthologous sequences, we date the FXR-FXRbeta duplication and the nonfunctionalization of FXRbeta in primates.