Department of Biology and Ecology, University of Ostrava, Ostrava 710 00, Czech Republic.
Department of Physics, University of Ostrava, Ostrava 710 00, Czech Republic.
Brief Bioinform. 2022 May 13;23(3). doi: 10.1093/bib/bbac045.
SARS-CoV-2 is a novel positive-sense single-stranded RNA virus from the Coronaviridae family (genus Betacoronavirus), which has been established as the cause of the COVID-19 pandemic. The genome of SARS-CoV-2 is one of the largest among known RNA viruses, comprising at least 26 known protein-coding loci. Studies thus far have outlined the coding capacity of the positive-sense strand of the SARS-CoV-2 genome, which can be used directly for protein translation. However, it has recently been shown that the negative-sense viral RNA intermediates transcribed during genome replication of positive-sense viruses can also code for proteins. No studies have yet explored the potential for negative-sense SARS-CoV-2 RNA intermediates to contain protein-coding loci. Thus, using sequence- and structure-based bioinformatics methodologies, we have investigated the presence and validity of putative negative-sense ORFs (nsORFs) in the SARS-CoV-2 genome. Nine nsORFs were found to contain strong eukaryotic translation initiation signals and high codon adaptability scores, and several of the nsORFs were predicted to interact with RNA-binding proteins. Evolutionary conservation analyses indicated that some of the nsORFs are deeply conserved among related coronaviruses. Three-dimensional protein modeling revealed the presence of higher-order folding among all putative SARS-CoV-2 nsORFs, and subsequent structural mimicry analyses suggested similarity of the nsORFs to DNA/RNA-binding proteins and proteins involved in immune signaling pathways. Altogether, these results suggest the potential existence of still undescribed SARS-CoV-2 proteins, which may play an important role in the viral lifecycle and COVID-19 pathogenesis.
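The abstract does not name the tools used to locate nsORFs, so the following is only a minimal sketch of the general idea, not the authors' pipeline: take the reverse complement of the positive-sense genome and scan its three reading frames for start-to-stop stretches. The function names, the `min_codons` threshold, and the use of the standard stop codons are all illustrative assumptions. Translation initiation signals and codon adaptability, which the study also evaluates, are not scored here.

```python
# Hypothetical helper names; the length threshold is an arbitrary assumption.
STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard genetic code stops
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, written in the DNA alphabet."""
    return seq.upper().replace("U", "T").translate(_COMPLEMENT)[::-1]


def find_negative_sense_orfs(genome: str, min_codons: int = 30):
    """Scan the three frames of the negative-sense strand for ORFs.

    Returns a list of (start, end, frame, sequence) tuples. Coordinates are
    0-based, end-exclusive, and refer to the reverse-complemented sequence.
    An ORF runs from the first ATG in a frame to the next in-frame stop.
    """
    neg = reverse_complement(genome)
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(neg) - 2, 3):
            codon = neg[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                # Count codons before the stop; keep only long candidates.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame, neg[start:i + 3]))
                start = None                   # resume searching this frame
    return orfs


if __name__ == "__main__":
    # Toy positive-sense sequence whose reverse complement is ATG AAA CCC TAA.
    toy = "TTAGGGTTTCAT"
    for orf in find_negative_sense_orfs(toy, min_codons=3):
        print(orf)
```

On a real genome, candidates found this way would only be a starting point; the conservation, initiation-signal, and structural analyses described in the abstract are what distinguish plausible nsORFs from chance open frames.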