Manak J Robert, Dike Sujit, Sementchenko Victor, Kapranov Philipp, Biemar Frederic, Long Jeff, Cheng Jill, Bell Ian, Ghosh Srinka, Piccolboni Antonio, Gingeras Thomas R
Affymetrix, Inc., Santa Clara, California, 95051, USA.
Nat Genet. 2006 Oct;38(10):1151-8. doi: 10.1038/ng1875. Epub 2006 Sep 3.
Many animal and plant genomes are transcribed much more extensively than current annotations predict. However, the biological function of these unannotated transcribed regions is largely unknown. Approximately 7% and 23% of the detected transcribed nucleotides during D. melanogaster embryogenesis map to unannotated intergenic and intronic regions, respectively. Based on computational analysis of coordinated transcription, we conservatively estimate that 29% of all unannotated transcribed sequences function as missed or alternative exons of well-characterized protein-coding genes. We estimate that 15.6% of intergenic transcribed regions function as missed or alternative transcription start sites (TSS) used by 11.4% of the expressed protein-coding genes. Identification of P element mutations within or near newly identified 5' exons provides a strategy for mapping previously uncharacterized mutations to their respective genes. Collectively, these data indicate that at least 85% of the fly genome is transcribed and processed into mature transcripts representing at least 30% of the fly genome.