Milligan Laura, Huynh-Thu Vân A, Delan-Forino Clémentine, Tuck Alex, Petfalski Elisabeth, Lombraña Rodrigo, Sanguinetti Guido, Kudla Grzegorz, Tollervey David
Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh, UK.
School of Informatics, University of Edinburgh, Edinburgh, UK; Department of Electrical Engineering and Computer Science, University of Liège, Liège, Belgium.
Mol Syst Biol. 2016 Jun 10;12(6):874. doi: 10.15252/msb.20166869.
Reversible modification of the RNAPII C-terminal domain (CTD) links transcription with RNA processing and surveillance activities. To better understand this coupling, we mapped the location of RNAPII carrying the five types of CTD phosphorylation on the RNA transcript, providing strand-specific, nucleotide-resolution information, and we used a machine learning-based approach to define RNAPII states. This revealed enrichment of Ser5P, and depletion of Tyr1P, Ser2P, Thr4P, and Ser7P, in the transcription start site (TSS) proximal ~150 nt of most genes, with depletion of all modifications close to the poly(A) site. The TSS region also showed elevated RNAPII relative to regions further 3', with high recruitment of RNA surveillance and termination factors, and correlated with the previously mapped 3' ends of short, unstable ncRNA transcripts. A hidden Markov model identified distinct modification states associated with initiating, early elongating, and later elongating RNAPII. The initiation state was enriched near the TSS of protein-coding genes and persisted throughout exon 1 of intron-containing genes. Notably, unstable ncRNAs apparently failed to transition into the elongation states seen on protein-coding genes.
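To illustrate how a hidden Markov model can assign positions along a transcript to states such as initiating, early elongating, and later elongating RNAPII, the following is a minimal sketch of Viterbi decoding. It is not the model used in the study. The three-symbol observation alphabet (`ser5_high`, `mixed`, `ser2_high`), the left-to-right transition structure, and every probability value are hypothetical choices made for illustration only.

```python
import math

# Hidden states named after the RNAPII states described in the abstract.
STATES = ["initiation", "early_elongation", "late_elongation"]

# All probabilities below are hypothetical toy values, not fitted to any data.
START = {"initiation": 0.8, "early_elongation": 0.1, "late_elongation": 0.1}

# Assumed left-to-right structure: a state can persist or move forward only.
TRANS = {
    "initiation":       {"initiation": 0.8, "early_elongation": 0.2, "late_elongation": 0.0},
    "early_elongation": {"initiation": 0.0, "early_elongation": 0.8, "late_elongation": 0.2},
    "late_elongation":  {"initiation": 0.0, "early_elongation": 0.0, "late_elongation": 1.0},
}

# Assumed discretised observations per position along the transcript.
EMIT = {
    "initiation":       {"ser5_high": 0.7, "mixed": 0.2, "ser2_high": 0.1},
    "early_elongation": {"ser5_high": 0.2, "mixed": 0.6, "ser2_high": 0.2},
    "late_elongation":  {"ser5_high": 0.1, "mixed": 0.2, "ser2_high": 0.7},
}


def _log(p):
    """Logarithm that maps a zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations):
    """Return the most probable hidden-state path for a non-empty sequence."""
    # scores[s] holds the best log-probability of any path ending in state s.
    scores = {s: _log(START[s]) + _log(EMIT[s][observations[0]]) for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor of state s at this position.
            prev = max(STATES, key=lambda p: scores[p] + _log(TRANS[p][s]))
            new_scores[s] = scores[prev] + _log(TRANS[prev][s]) + _log(EMIT[s][obs])
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    toy_signal = ["ser5_high", "ser5_high", "mixed", "mixed", "ser2_high", "ser2_high"]
    for position, state in enumerate(viterbi(toy_signal)):
        print(position, state)
```

Working in log space avoids numerical underflow on long sequences. In practice the transition and emission parameters would be learned from the data rather than fixed by hand as they are here.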