National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada.
Department of Integrative Biology & Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, Canada.
Microbiol Spectr. 2024 Apr 2;12(4):e0358423. doi: 10.1128/spectrum.03584-23. Epub 2024 Mar 4.
We analyzed potential factors affecting host adaptation of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in white-tailed deer, humans, and mink, three hosts for which there is strong evidence of sustained transmission. Classification models trained on single nucleotide and amino acid differences between samples effectively identified white-tailed deer-, human-, and mink-derived SARS-CoV-2. For example, the balanced accuracy score of Extremely Randomized Trees classifiers was 0.984 ± 0.006. Eighty-eight commonly identified predictive mutations are found at sites under strong positive and negative selective pressure. A large fraction of sites under selection (86.9%) or identified by machine learning (87.1%) are found in genes other than the spike. Some locations encoded by these gene regions are predicted to be B- and T-cell epitopes or are implicated in modulating the immune response, suggesting that host adaptation may involve the evasion of the host immune system, modulation of the class-I major-histocompatibility complex, and the diminished recognition of immune epitopes by CD8+ T cells. Our selection and machine learning analyses also indicated that silent mutations, such as C7303T and C9430T, play an important role in discriminating deer-derived samples across multiple clades. Finally, our investigation into the origin of the B.1.641 lineage from white-tailed deer in Canada discovered an additional human sequence from Michigan that is related to B.1.641 and was sampled near the time this lineage emerged.
These findings demonstrate that machine-learning approaches can be used in combination with evolutionary genomics to identify factors possibly involved in the cross-species transmission of viruses and the emergence of novel viral lineages.

IMPORTANCE: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly transmissible virus capable of infecting and establishing itself in human and wildlife populations, such as white-tailed deer. This fact highlights the importance of developing novel ways to identify genetic factors that contribute to its spread and adaptation to new host species. This is especially important since these populations can serve as reservoirs that potentially facilitate the re-introduction of new variants into human populations. In this study, we apply machine learning and phylogenetic methods to uncover biomarkers of SARS-CoV-2 adaptation in mink and white-tailed deer. We find evidence demonstrating that both non-synonymous and silent mutations can be used to differentiate animal-derived sequences from human-derived ones and each other. This evidence also suggests that host adaptation involves the evasion of the immune system and the suppression of antigen presentation. Finally, the methods developed here are general and can be used to investigate host adaptation in viruses other than SARS-CoV-2.
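To make the classification setup more concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each sample is represented by the set of nucleotide substitutions it carries relative to a reference, encodes those sets as a binary presence/absence matrix, and computes balanced accuracy as the mean of per-class recall. The toy samples, the mutations `A100G` and `G200T`, and the function names are hypothetical; only `C7303T` and `C9430T` come from the abstract.

```python
import re
from collections import defaultdict

# Hypothetical toy data: (host label, set of substitutions vs. a reference).
# Only C7303T and C9430T are named in the abstract; the rest are invented.
SAMPLES = [
    ("deer", {"C7303T", "C9430T"}),
    ("deer", {"C7303T"}),
    ("human", {"A100G"}),
    ("mink", {"G200T", "A100G"}),
]

MUTATION_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_mutation(mutation):
    """Split a substitution such as 'C7303T' into (ref, position, alt)."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        raise ValueError(f"not a single-nucleotide substitution: {mutation!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def encode(samples):
    """Return (feature_names, X, y) with X as a 0/1 presence matrix."""
    features = sorted(
        {m for _, muts in samples for m in muts},
        key=lambda m: (parse_mutation(m)[1], m),  # order by genome position
    )
    X = [[1 if f in muts else 0 for f in features] for _, muts in samples]
    y = [host for host, _ in samples]
    return features, X, y


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare host classes count equally."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


features, X, y = encode(SAMPLES)
# Hypothetical predictions from some classifier, used only to show the metric.
y_pred = ["deer", "human", "human", "mink"]
score = balanced_accuracy(y, y_pred)
```

A matrix like `X` with labels `y` is the kind of input a tree-ensemble classifier could be trained on. Balanced accuracy is a natural choice when host classes are unevenly sampled, because each class contributes equally to the score regardless of its size.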