Ryder Sean P, Morgan Brittany R, Coskun Peren, Antkowiak Katianna, Massi Francesca
Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA, USA.
Evol Bioinform Online. 2021 May 5;17:11769343211014167. doi: 10.1177/11769343211014167. eCollection 2021.
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has motivated a widespread effort to understand its epidemiology and pathogenic mechanisms. Modern high-throughput sequencing technology has led to the deposition of vast numbers of SARS-CoV-2 genome sequences in curated repositories, which have been useful in mapping the spread of the virus around the globe. They also provide a unique opportunity to observe virus evolution in real time. Here, we evaluate two sets of SARS-CoV-2 genomic sequences to identify emerging variants within structured cis-regulatory elements of the SARS-CoV-2 genome. Overall, 20 variants are present at a minor allele frequency of at least 0.5%. Several enhance the stability of Stem Loop 1 in the 5' untranslated region (UTR), including a group of co-occurring variants that extend its length. One appears to modulate the stability of the frameshifting pseudoknot between ORF1a and ORF1b, and another perturbs a bi-stable molecular switch in the 3'UTR. Finally, 5 variants destabilize structured elements within the 3'UTR hypervariable region, including the S2M (stem loop 2 motif) selfish genetic element, raising questions as to the functional relevance of these structures in viral replication. Two of the most abundant variants appear to be caused by RNA editing, suggesting host-viral defense contributes to SARS-CoV-2 genome heterogeneity. Our analysis has implications for the development of therapeutics that target viral cis-regulatory RNA structures or sequences.
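The 0.5% minor allele frequency cutoff mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which the abstract does not describe. It assumes the genomes are already aligned to a common length, treats the second most common base in each column as the minor allele, and ignores gaps and ambiguous bases. The function name `minor_allele_frequencies` and its parameters are hypothetical.

```python
from collections import Counter

# Assumption: only unambiguous nucleotides count toward allele frequencies.
VALID_BASES = set("ACGTU")


def minor_allele_frequencies(aligned_seqs, threshold=0.005):
    """Return {1-based position: (major, minor, maf)} for alignment columns
    whose minor allele frequency is at least `threshold`."""
    if not aligned_seqs:
        return {}
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    hits = {}
    for pos in range(length):
        # Count bases in this column, skipping gaps and ambiguity codes.
        counts = Counter(
            s[pos].upper() for s in aligned_seqs if s[pos].upper() in VALID_BASES
        )
        if len(counts) < 2:
            continue  # invariant column, no minor allele
        (major, _), (minor, minor_n) = counts.most_common(2)
        maf = minor_n / sum(counts.values())
        if maf >= threshold:
            hits[pos + 1] = (major, minor, maf)
    return hits
```

Under these assumptions, a variant carried by 2 of 200 aligned genomes would be reported (frequency 1%), while one carried by 1 of 1000 (0.1%) would fall below the default cutoff.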