Willingham A T, Dike S, Cheng J, Manak J R, Bell I, Cheung E, Drenkow J, Dumais E, Duttagupta R, Ganesh M, Ghosh S, Helt G, Nix D, Piccolboni A, Sementchenko V, Tammana H, Kapranov P, Gingeras T R
Affymetrix, Inc., Santa Clara, California 95051, USA.
Cold Spring Harb Symp Quant Biol. 2006;71:101-10. doi: 10.1101/sqb.2006.71.068.
Regions of the genome that do not code for proteins or are not involved in cis-acting regulatory activities are frequently viewed as lacking functional value. However, a number of recent large-scale studies have revealed significant regulated transcription of unannotated portions of a variety of plant and animal genomes, prompting a new appreciation of how widely the genome is transcribed. High-resolution mapping of the sites of transcription in the human and fly genomes has provided an alternative picture of the extent and organization of transcription and has offered insights into the biological functions of some of the newly identified unannotated transcripts. A considerable portion of the unannotated transcription observed consists of development-specific or cell-type-specific parts of protein-coding transcripts, often serving as novel, alternative 5' transcriptional start sites. These distal 5' portions are often situated at significant distances from the annotated gene and alternatively join with or skip portions of other intervening genes to form novel unannotated protein-coding transcripts. These data support an interlaced model of the genome in which many regions serve multiple functions and are used in a highly modular fashion. This model illustrates the underappreciated organizational complexity of the genome and one of the functional roles of transcription from its unannotated portions.