Kepler Lenora, Hamins-Puertolas Marco, Rasmussen David A
Bioinformatics Research Center, North Carolina State University, 1 Lampe Drive, Raleigh, NC 27607, USA.
Biomathematics Graduate Program, North Carolina State University, Campus Box 8213, Raleigh, NC 27695, USA.
Virus Evol. 2021 Sep 2;7(2):veab073. doi: 10.1093/ve/veab073. eCollection 2021.
The fitness of a pathogen is a composite phenotype determined by many different factors influencing growth rates both within and between hosts. Determining what factors shape fitness at the host population level is especially challenging because both intrinsic factors like pathogen genetics and extrinsic factors such as host behavior influence between-host transmission potential. This challenge has been highlighted by controversy surrounding the population-level fitness effects of mutations in the SARS-CoV-2 genome and their relative importance when compared against non-genetic factors shaping transmission dynamics. Building upon phylodynamic birth-death models, we develop a new framework to learn how hundreds of genetic and non-genetic factors have shaped the fitness of SARS-CoV-2. We estimate the fitness effects of all amino acid variants and several structural variants that have circulated in the United States between February 2020 and March 2021 from viral phylogenies. We also estimate how much fitness variation among pathogen lineages is attributable to genetic versus non-genetic factors such as spatial heterogeneity in transmission rates. Before September 2020, most fitness variation between lineages can be explained by background spatial heterogeneity in transmission rates across geographic regions. Starting in late 2020, genetic variation in fitness increased dramatically with the emergence of several new lineages including B.1.1.7, B.1.427, B.1.429 and B.1.526. Our analysis also indicates that genetic variants in less well-explored genomic regions outside of Spike may be contributing significantly to overall fitness variation in the viral population.