DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA.
Biological Computation and Process Laboratory, Chemical Process and Energy Resources Institute, Centre for Research and Technology Hellas, Thessalonica, Greece
mBio. 2021 Jan 19;12(1):e03014-20. doi: 10.1128/mBio.03014-20.
Orf8, one of the most puzzling genes in the SARS lineage of coronaviruses, marks a unique and striking difference in genome organization between SARS-CoV-2 and SARS-CoV-1. Here, using sequence comparisons, we unequivocally reveal the distant sequence similarities between SARS-CoV-2 Orf8 and its SARS-CoV-1 counterparts and the X4-like genes of coronaviruses, including its highly divergent "paralog" gene Orf7a, whose product is a potential immune antagonist of known structure. Supervised sequence space walks unravel identity levels that drop below 10% and yet exhibit subtle conservation patterns in this novel superfamily, characterized by an immunoglobulin-like beta sandwich topology. We document the high accuracy of the sequence space walk process in detail and characterize the subgroups of the superfamily in sequence space by systematic annotation of gene and taxon groups. While SARS-CoV-1 Orf7a and Orf8 genes are most similar to bat virus sequences, their SARS-CoV-2 counterparts are closer to pangolin virus homologs, reflecting the fine structure of conservation patterns within the SARS-CoV-2 genomes. The divergence between Orf7a and Orf8 is exceptionally idiosyncratic, since Orf7a is more constrained, whereas Orf8 is subject to rampant change, a peculiar feature that may be related to hitherto-unknown viral infection strategies. Despite their common origin, the Orf7a and Orf8 protein families exhibit different evolutionary trajectories within the coronavirus lineage, which might be partly attributable to their complex interactions with the mammalian host cell, reflected by a multitude of functional associations of Orf8 in SARS-CoV-2 compared to a very small number of interactions discovered for Orf7a.
Orf8 is one of the most puzzling genes in the SARS lineage of coronaviruses, including SARS-CoV-2. Using sophisticated sequence comparisons, we confirm its origins from Orf7a, another gene in the lineage that appears more conserved than Orf8. Orf7a is a potential immune antagonist of known structure, while a deletion of Orf8 was shown to decrease the severity of the infection in a cohort study. The subtle sequence similarities imply that Orf8 has the same immunoglobulin-like fold as Orf7a, confirmed by structure determination. We characterize the subgroups of this superfamily and demonstrate the highly idiosyncratic divergence patterns during the evolution of the virus.
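To make the "identity levels that drop below 10%" figure concrete, the following is a minimal sketch of how percent identity between two already-aligned protein sequences can be computed. It is not the authors' pipeline: the function name `percent_identity`, the choice to count only columns where neither sequence has a gap, and the toy sequences are all assumptions made for illustration. Other conventions for the denominator exist (for example, alignment length or the length of the shorter sequence) and give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where those residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0  # nothing to compare
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration
# (hypothetical; NOT real Orf7a or Orf8 sequences).
seq_a = "MKLV-CTAGH"
seq_b = "MRLVACSA-H"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

At identity levels this low, a single pairwise comparison is generally not enough to infer homology, which is why the abstract describes stepping through intermediate sequences rather than comparing Orf8 and Orf7a directly.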