Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, SE-751 85 Uppsala, Sweden.
Genome Biol. 2010;11(7):R78. doi: 10.1186/gb-2010-11-7-r78. Epub 2010 Jul 23.
We profile the chimpanzee transcriptome by deep sequencing of cDNA from brain and liver, aiming to quantify the expression of known genes and to identify novel transcribed regions.
Using stringent criteria for transcription, we identify 12,843 expressed genes, the majority of which are found in both tissues. We further identify 9,826 novel transcribed regions that do not overlap annotated exons, mRNAs or ESTs. Over 80% of the novel transcribed regions map within or in the vicinity of known genes, and by combining the sequencing data with de novo splice predictions we predict several of them to be new exons or 3' UTRs. For approximately 350 novel transcribed regions, the corresponding DNA sequence is absent from the human reference genome. The presence of novel transcribed regions in five genes and in one intergenic region is further validated by RT-PCR. Finally, we describe and experimentally validate a putative novel multi-exon gene that belongs to the ATP-binding cassette (ABC) transporter gene family. This gene does not appear to be functional in humans, since one of its exons is absent from the human genome. In addition to novel exons and UTRs, novel transcribed regions may also stem from different types of noncoding transcripts. We note that expressed repeats and introns from unspliced mRNAs are especially common in our data.
Our results extend the chimpanzee gene catalogue with a large number of novel exons and 3' UTRs and thus support the view that mammalian gene annotations are not yet complete.
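The definition of a novel transcribed region used above (a transcribed region that does not overlap any annotated exon, mRNA or EST) can be illustrated with a small interval filter. This is a minimal sketch, not the pipeline used in the study: the function names, the tuple layout `(chrom, start, end)` and the half-open coordinate convention are all assumptions made for illustration.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(annotations):
    """Merge annotated intervals per chromosome into sorted, disjoint blocks.

    `annotations` is an iterable of (chrom, start, end) tuples using
    half-open coordinates [start, end). This layout is an assumption of
    the sketch, not a format taken from the study.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in annotations:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        starts, ends = [], []
        for start, end in sorted(intervals):
            if ends and start <= ends[-1]:
                # Overlapping or adjacent to the previous block: extend it.
                ends[-1] = max(ends[-1], end)
            else:
                starts.append(start)
                ends.append(end)
        index[chrom] = (starts, ends)
    return index


def overlaps_annotation(index, chrom, start, end):
    """Return True if [start, end) overlaps any annotated block on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Last merged block that begins before the query region ends.
    i = bisect_left(starts, end) - 1
    return i >= 0 and ends[i] > start


def novel_regions(regions, annotations):
    """Keep only the transcribed regions that overlap no annotation."""
    index = build_index(annotations)
    return [r for r in regions if not overlaps_annotation(index, *r)]


if __name__ == "__main__":
    # Toy coordinates, invented purely for demonstration.
    annotated = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
    transcribed = [("chr1", 10, 50), ("chr1", 290, 350), ("chr2", 10, 50)]
    for region in novel_regions(transcribed, annotated):
        print(region)
```

Merging the annotations first keeps each lookup to a single binary search, which matters when many thousands of candidate regions are checked against a genome-wide annotation set.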