Ramarao-Milne Priya, Jain Yatish, Sng Letitia M F, Hosking Brendan, Lee Carol, Bayat Arash, Kuiper Michael, Wilson Laurence O W, Twine Natalie A, Bauer Denis C
Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, New South Wales, Sydney, Australia.
Department of Biomedical Sciences, Macquarie University, New South Wales, Sydney, Australia.
Comput Struct Biotechnol J. 2022;20:2942-2950. doi: 10.1016/j.csbj.2022.06.005. Epub 2022 Jun 3.
New SARS-CoV-2 variants emerge as part of the virus's adaptation to the human host. Health organizations monitor newly emerging variants with suspected impact on disease or vaccination efficacy, such as Delta and Omicron, as Variants Being Monitored (VBM). Genetic changes (single nucleotide variants, SNVs) relative to the Wuhan variant characterize VBMs, with current emphasis on the spike protein and lineage markers. However, monitoring VBMs in this way might miss SNVs with a functional effect on disease. Here we introduce a lineage-agnostic, genome-wide approach to identify SNVs associated with disease. We curated a case-control dataset of 10,520 samples and identified 117 SNVs significantly associated with adverse patient outcome. While 40% (47) of these SNVs are already monitored and 36% (43) are in the spike protein, we also identified 70 new SNVs that are associated with disease outcome. Of these, 31 are disease-worsening and predominantly located in the 3'-5' exonuclease (NSP14), with structural modelling revealing a concise cluster in the Zn-binding domain, which has known host-immune modulating function. Furthermore, we generate clade-independent VBM groupings by identifying interacting SNVs (epistasis). We find 37 sets of higher-order epistatic interactions joining 5 genomic regions (nsp3, nsp14, Spike S1, ORF3a, N). Structural modelling of these regions provides insights into potential mechanistic pathways of increased virulence as well as orthogonal methods of validation. Clade-independent monitoring of functionally interacting (epistasis, co-evolution) SNVs detected emerging VBMs a week before they were flagged by health organizations and, in conjunction with structural modelling, provides faster, mechanistic insight into emerging strains to guide public health interventions.
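To make the idea of a lineage-agnostic case-control scan concrete, the following is a minimal sketch of a per-SNV association test. The abstract does not name the statistical method used, so this is not the authors' pipeline: it assumes a plain two-sided Fisher exact test on a 2x2 table per SNV with a Bonferroni-corrected threshold. The SNV labels and counts in `toy_counts` are synthetic placeholders, and the function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table
    [[a, b], [c, d]] = [[cases with SNV, cases without],
                        [controls with SNV, controls without]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that x of the col1 carriers are cases
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def scan_snvs(counts, alpha=0.05):
    """Return {snv: (p_value, odds_ratio)} for SNVs passing a Bonferroni threshold."""
    threshold = alpha / len(counts)
    hits = {}
    for snv, (a, b, c, d) in counts.items():
        p = fisher_exact_two_sided(a, b, c, d)
        if p < threshold:
            # Haldane-Anscombe correction (+0.5) avoids division by zero
            odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
            hits[snv] = (p, odds_ratio)
    return hits


# Synthetic counts (assumption, not data from the study):
# (cases with SNV, cases without, controls with SNV, controls without)
toy_counts = {
    "snv_A": (30, 70, 5, 95),   # enriched in cases
    "snv_B": (20, 80, 18, 82),  # similar frequency in both groups
    "snv_C": (2, 98, 25, 75),   # depleted in cases
}

hits = scan_snvs(toy_counts)
for snv, (p, odds_ratio) in hits.items():
    direction = "disease-worsening" if odds_ratio > 1 else "protective"
    print(f"{snv}: p={p:.2e}, OR={odds_ratio:.2f} ({direction})")
```

Because each SNV is tested against outcome directly, no lineage or clade label enters the calculation. A single-SNV test like this cannot detect the epistatic interactions described above; those require a method that evaluates combinations of SNVs.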