Gingeras Thomas R
Affymetrix, Inc., Santa Clara, California 95051, USA.
Genome Res. 2007 Jun;17(6):682-90. doi: 10.1101/gr.6525007.
While the concept of a gene has been helpful in defining the relationship of a portion of a genome to a phenotype, this traditional term may not be as useful as it once was. Currently, "gene" has come to refer principally to a genomic region producing a polyadenylated mRNA that encodes a protein. However, the recent emergence of a large collection of unannotated transcripts with apparently little protein-coding capacity, collectively called transcripts of unknown function (TUFs), has begun to blur the physical boundaries and genomic organization of genic regions, with noncoding transcripts often overlapping protein-coding genes on the same (sense) and opposite (antisense) strands. Moreover, these transcripts are often located in intergenic regions, making the genic portions of the human genome an interleaved network of both annotated polyadenylated and nonpolyadenylated transcripts, including splice variants with novel 5' ends extending hundreds of kilobases. This complex transcriptional organization and other recently observed features of genomes argue for the reconsideration of the term "gene" and suggest that transcripts may be used to define the operational unit of a genome.