Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA.
Department of Biology, Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA.
Mol Biol Evol. 2021 Dec 9;38(12):5678-5684. doi: 10.1093/molbev/msab265.
The programmed frameshift element (PFE), which reroutes translation from ORF1a to ORF1b, is essential for the propagation of coronaviruses. The combination of genomic features that make up the PFE (the overlap between the two reading frames, a slippery sequence, and an ensemble of complex secondary structure elements) places severe constraints on this region, as most possible nucleotide substitutions may disrupt one or more of these elements. The vast amount of SARS-CoV-2 sequencing data generated within the past year provides an opportunity to assess the evolutionary dynamics of the PFE in great detail. Here, we performed a comparative analysis of all coronaviral genomic data available to date. We show that the overlap between ORF1a and ORF1b evolved as a set of discrete 7, 16, 22, 25, and 31 nucleotide stretches with a well-defined phylogenetic specificity. We further examined sequencing data from over 1,500,000 complete genomes and 55,000 raw read data sets to demonstrate exceptional conservation and detect signatures of selection within the PFE region.
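As an illustration of one PFE feature named above, the slippery sequence, the following minimal sketch scans a nucleotide string for the general -1 frameshift heptamer consensus X XXY YYZ (three identical bases, then AAA or UUU, then A, C, or U). It is not the authors' pipeline: the function name `find_slippery_sites` and the toy input are hypothetical, and using this consensus motif is an assumption made for the example.

```python
import re

# Assumed consensus for a -1 frameshift slippery heptamer, X_XXY_YYZ,
# written in the DNA alphabet: XXX = any base repeated three times,
# YYY = AAA or TTT, Z = A, C, or T. The lookahead lets overlapping
# candidates be reported as well.
SLIPPERY = re.compile(r"(?=(([ACGT])\2\2(?:AAA|TTT)[ACT]))")


def find_slippery_sites(seq):
    """Return (0-based start, heptamer) for each candidate slippery site.

    Hypothetical helper for illustration; accepts DNA or RNA letters.
    """
    dna = seq.upper().replace("U", "T")
    return [(m.start(), m.group(1)) for m in SLIPPERY.finditer(dna)]


if __name__ == "__main__":
    # Toy sequence containing the heptamer TTTAAAC (UUUAAAC in RNA).
    for start, heptamer in find_slippery_sites("GCGTTTAAACGGG"):
        print(start, heptamer)
```

A motif hit is only a candidate: a functional element also depends on the downstream secondary structure and the reading-frame overlap described in the abstract, neither of which this sketch checks.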