Department of Microbiology and Immunology, The University of Iowa, Iowa City, Iowa, United States of America.
Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, Iowa, United States of America.
PLoS Comput Biol. 2024 Jun 10;20(6):e1012215. doi: 10.1371/journal.pcbi.1012215. eCollection 2024 Jun.
New sublineages of SARS-CoV-2 variants of concern (VOCs) continually emerge with mutations in the spike glycoprotein. In most cases, the sublineage-defining mutations differ between VOCs. It is unclear whether these differences reflect lineage-specific likelihoods of mutation at each spike position or the stochastic nature of their appearance. Here we show that SARS-CoV-2 lineages have distinct evolutionary spaces (a probabilistic definition of the sequence states that can be occupied by expanding virus subpopulations). This space can be accurately inferred from the patterns of amino acid variability at the whole-protein level. Robust networks of co-variable sites identify the highest-likelihood mutations in new VOC sublineages and predict remarkably well the emergence of subvariants with mutations that confer resistance to COVID-19 therapeutics. Our studies reveal that low-frequency variant patterns at heterologous sites across the protein contribute to accurate prediction of the changes at each position of interest.
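The abstract does not specify how co-variability between sites is computed, so the following is only a minimal sketch of one way such a network could be built, not the authors' method. It assumes that aligned sequences are already grouped into clusters, that a position counts as "variable" in a cluster when more than one residue is observed there, and that co-variability between two positions is scored as the Jaccard overlap of those flags across clusters. The function names `variability_matrix` and `covariability` and the toy sequences are hypothetical.

```python
from itertools import combinations


def variability_matrix(clusters):
    """For each cluster of aligned sequences, flag positions showing >1 residue.

    Assumption: a position is 'variable' in a cluster if at least two distinct
    amino acids are observed there. Returns one row of 0/1 flags per cluster.
    """
    rows = []
    for seqs in clusters:
        length = len(seqs[0])
        rows.append([int(len({s[i] for s in seqs}) > 1) for i in range(length)])
    return rows


def covariability(rows):
    """Score each pair of positions by the Jaccard overlap of their flags.

    Assumption: Jaccard overlap stands in for whatever association measure
    the study actually uses. Pairs that never vary in any cluster are omitted.
    """
    n_pos = len(rows[0])
    scores = {}
    for i, j in combinations(range(n_pos), 2):
        both = sum(r[i] and r[j] for r in rows)    # clusters where both vary
        either = sum(r[i] or r[j] for r in rows)   # clusters where at least one varies
        if either:
            scores[(i, j)] = both / either
    return scores


# Toy, made-up alignment clusters (0-based positions), for illustration only.
clusters = [
    ["MKTA", "MRTA", "MKSA"],  # positions 1 and 2 vary
    ["MKTA", "MRSA"],          # positions 1 and 2 vary
    ["MKTA", "MKTG"],          # position 3 varies
    ["MKTA", "MKTA"],          # no position varies
]
rows = variability_matrix(clusters)
scores = covariability(rows)
```

In this toy input, positions 1 and 2 vary in the same clusters, so they would form the strongest edge of the network. Under the assumptions above, variability observed at one of them could be read as raising the likelihood of change at the other.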