Ruiz-Orera Jorge, Hernandez-Rodriguez Jessica, Chiva Cristina, Sabidó Eduard, Kondova Ivanela, Bontrop Ronald, Marqués-Bonet Tomàs, Albà M Mar
Evolutionary Genomics Group, Hospital del Mar Research Institute (IMIM), Barcelona, Spain.
Department of Experimental and Health Sciences, Universitat Pompeu Fabra (UPF), Barcelona, Spain.
PLoS Genet. 2015 Dec 31;11(12):e1005721. doi: 10.1371/journal.pgen.1005721. eCollection 2015 Dec.
The birth of new genes is an important motor of evolutionary innovation. Whereas many new genes arise by gene duplication, others originate at genomic regions that did not contain any genes or gene copies. Some of these newly expressed genes may acquire coding or non-coding functions and be preserved by natural selection. However, the prevalence and underlying mechanisms of de novo gene emergence remain unclear. To obtain a comprehensive view of this process, we performed in-depth sequencing of the transcriptomes of four mammalian species (human, chimpanzee, macaque, and mouse) and then compared the assembled transcripts and the corresponding syntenic genomic regions. This identified over five thousand new multiexonic transcriptional events in human and/or chimpanzee that are not observed in the other species. Using comparative genomics, we show that the expression of these transcripts is associated with the gain of regulatory motifs upstream of the transcription start site (TSS) and of U1 snRNP sites downstream of the TSS. In general, these transcripts show little evidence of purifying selection, suggesting that many of them are not functional. However, we find signatures of selection in a subset of de novo genes that have evidence of protein translation. Taken together, the data support a model in which frequently occurring new transcriptional events in the genome provide the raw material for the evolution of new proteins.