Medical Research Council-University of Glasgow Centre for Virus Research, Glasgow G11 5JR, United Kingdom.
Proc Natl Acad Sci U S A. 2011 Dec 6;108(49):19755-60. doi: 10.1073/pnas.1115861108. Epub 2011 Nov 22.
Deep sequencing was used to bring high resolution to the human cytomegalovirus (HCMV) transcriptome at the stage when infectious virion production is under way, and major findings were confirmed by extensive experimentation using conventional techniques. The majority (65.1%) of polyadenylated viral RNA transcription is committed to producing four noncoding transcripts (RNA2.7, RNA1.2, RNA4.9, and RNA5.0) that do not substantially overlap designated protein-coding regions. Additional noncoding RNAs that are transcribed antisense to protein-coding regions map throughout the genome and account for 8.7% of transcription from these regions. RNA splicing is more common than recognized previously, as evidenced by the identification of 229 potential donor and 132 acceptor sites, and it affects 58 protein-coding genes. The great majority (94) of the 96 splice junctions most abundantly represented in the deep-sequencing data were confirmed by RT-PCR or RACE or supported by involvement in alternative splicing. Alternative splicing is frequent and particularly evident in four genes (RL8A, UL74A, UL124, and UL150A) that are transcribed by splicing from any one of many upstream exons. The analysis also resulted in the annotation of four previously unrecognized protein-coding regions (RL8A, RL9A, UL150A, and US33A), and expression of the UL150A protein was shown in the context of HCMV infection. The overall conclusion, that HCMV transcription is complex and multifaceted, has implications for the potential sophistication of virus functionality during infection. The study also illustrates the key contribution that deep sequencing can make to the genomics of nuclear DNA viruses.