School of Science, Nanjing University of Posts and Telecommunications, Key Laboratory of Radio and Micro-Nano Electronics of Jiangsu Province, Nanjing 210023, People's Republic of China.
Department of Computational & Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, United States of America.
Phys Biol. 2024 Nov 21;22(1). doi: 10.1088/1478-3975/ad9213.
Throughout the SARS-CoV-2 pandemic, genetic variation has contributed to the spread and persistence of the virus. For example, various mutations have allowed SARS-CoV-2 to escape antibody neutralization or to bind more strongly to the receptors that it uses to enter human cells. Here, we compare two methods that estimate the fitness effects of viral mutations using the abundant sequence data gathered over the course of the pandemic. Both approaches are grounded in population genetics theory but make different assumptions. One approach, tQLE, features an epistatic fitness landscape and assumes that alleles are nearly in linkage equilibrium. The other, MPL, assumes a simple, additive fitness landscape but allows for any level of correlation between alleles. We characterize differences in the distributions of fitness values inferred by each approach and in the ranks of fitness values that they assign to sequences across time. We find that in a large fraction of weeks the two methods are in good agreement as to their top-ranked sequences, i.e. as to which sequences observed that week are most fit. We also find that the agreement between the two methods' sequence rankings varies with the genetic unimodality of the population in a given week.
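To make the contrast between an additive and an epistatic fitness landscape concrete, the following minimal Python sketch scores a handful of toy binary sequences under both kinds of landscape and measures how well the two rankings agree. It is not the tQLE or MPL inference procedure: all function names, the toy sequences, the selection coefficients `s`, and the coupling `J` are hypothetical, and the two agreement measures used here (Spearman rank correlation and top-k overlap) are assumptions chosen for illustration; the abstract does not state which metrics were used.

```python
from itertools import combinations

def additive_fitness(seq, s):
    """Additive landscape: sum of per-site effects over mutant alleles (1 = mutant)."""
    return sum(s_i * x_i for s_i, x_i in zip(s, seq))

def epistatic_fitness(seq, h, J):
    """Pairwise landscape: single-site terms plus couplings J[(i, j)] for i < j."""
    single = sum(h_i * x_i for h_i, x_i in zip(h, seq))
    pair = sum(J.get((i, j), 0.0) * seq[i] * seq[j]
               for i, j in combinations(range(len(seq)), 2))
    return single + pair

def ranks(values):
    """Rank 1 = largest value; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: -values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            result[k] = avg_rank
        pos = end + 1
    return result

def spearman(a, b):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5

def top_k_overlap(a, b, k):
    """Fraction of the k top-scoring sequences shared by the two score lists."""
    top_a = set(sorted(range(len(a)), key=lambda i: -a[i])[:k])
    top_b = set(sorted(range(len(b)), key=lambda i: -b[i])[:k])
    return len(top_a & top_b) / k

# Toy data (hypothetical): five sequences over three biallelic sites.
sequences = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (1, 1, 1)]
s = [0.10, 0.05, -0.02]   # per-site effects, shared by both toy landscapes
J = {(0, 1): -0.12}       # one negative coupling between sites 0 and 1

additive = [additive_fitness(seq, s) for seq in sequences]
epistatic = [epistatic_fitness(seq, s, J) for seq in sequences]

print("Spearman correlation:", spearman(additive, epistatic))
print("Top-3 overlap:", top_k_overlap(additive, epistatic, 3))
```

In this toy case the single negative coupling demotes the double mutant that the additive landscape ranks first, so the two rankings agree only partially. It illustrates how two landscapes with identical single-site effects can disagree about which sequences are most fit.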