de Hoon Michiel, Hayashizaki Yoshihide
Genome Exploration Research Group, RIKEN Genomic Sciences Center, Yokohama Institute, Yokohama, Kanagawa, Japan.
Biotechniques. 2008 Apr;44(5):627-8, 630, 632. doi: 10.2144/000112802.
In cap analysis gene expression (CAGE), short (approximately 20-nucleotide) sequence tags originating from the 5' end of full-length mRNAs are sequenced to identify transcription events on a genome-wide scale. The rapid increase in the throughput of present-day sequencers allows much deeper CAGE tag sequencing, in which tags are found multiple times for each mRNA in a given experiment. CAGE tag counts can then be used to reliably estimate the cellular concentration of the corresponding mRNA. In contrast to microarray and SAGE expression profiling, CAGE identifies the location of each transcription start site in addition to its expression level. This makes it possible to infer a genome-wide network of transcriptional regulation by searching the promoter region surrounding each CAGE-defined transcription start site for potential transcription factor binding sites. Hence, deep CAGE is a unique tool for constructing a promoter-based network of transcriptional regulation. CAGE-based expression profiling also allows us to identify dynamic promoter usage in time-course experiments and the specific promoter regulated by a given transcription factor in disruption experiments. The sheer size of the short-tag datasets produced by next-generation sequencers calls for new software able to handle them. In addition, new visualization methods will be needed to represent a promoter-based transcriptional network.
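The two computational steps the abstract describes, turning tag counts into expression estimates and searching the promoter around each CAGE-defined start site, can be illustrated with a minimal Python sketch. It is not the authors' software. The function names, the tags-per-million normalization, the window sizes, and the exact-string motif match are all assumptions made for illustration; practical binding-site searches typically score position weight matrices and handle both strands.

```python
from collections import Counter


def tags_per_million(tag_starts):
    """Count CAGE tags per start site and scale to tags per million.

    tag_starts: iterable of hashable start-site keys, here
    (chromosome, position, strand) tuples, one entry per mapped tag.
    Normalizing by library size is an assumption, not a method
    stated in the abstract.
    """
    counts = Counter(tag_starts)
    total = sum(counts.values())
    return {site: 1e6 * n / total for site, n in counts.items()}


def promoter_window(sequence, tss, upstream=10, downstream=5):
    """Return the forward-strand sequence around a 0-based TSS index.

    The window sizes are hypothetical defaults chosen to keep the
    example small.
    """
    start = max(0, tss - upstream)
    return sequence[start:tss + downstream]


def motif_hits(window, motif):
    """Return every offset at which `motif` occurs exactly in `window`.

    Exact matching stands in for a real binding-site model.
    """
    hits = []
    i = window.find(motif)
    while i != -1:
        hits.append(i)
        i = window.find(motif, i + 1)
    return hits


if __name__ == "__main__":
    # Toy data: three tags at one start site, one tag at another.
    tags = [("chr1", 100, "+")] * 3 + [("chr1", 250, "+")]
    print(tags_per_million(tags))

    # Toy sequence with a TATA-like motif upstream of index 12.
    seq = "GGGGTATAAAGGGGCCCC"
    window = promoter_window(seq, tss=12)
    print(window, motif_hits(window, "TATAAA"))
```

Keying the counts on the start-site position rather than on a gene identifier reflects the point made in the abstract: alternative promoters of the same gene remain separate entries, so their usage can be compared across samples.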