VanInsberghe David, Neish Andrew S, Lowen Anice C, Koelle Katia
Department of Biology, Emory University, 1510 Clifton Rd, Atlanta, GA, 30322 USA.
Department of Microbiology and Immunology, Emory University School of Medicine, 100 Woodruff Circle, Atlanta, GA, 30322 USA.
Virus Evol. 2021 Jul 15;7(2):veab059. doi: 10.1093/ve/veab059. eCollection 2021 Sep.
Viral recombination can generate novel genotypes with unique phenotypic characteristics, including transmissibility and virulence. Although the capacity for recombination among betacoronaviruses is well documented, recombination between strains of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has not been characterized in detail. Here, we present a lightweight approach for detecting genomes that are potentially recombinant. This approach relies on identifying the mutations that primarily determine SARS-CoV-2 clade structure and then screening genomes for ones that contain multiple mutational markers from distinct clades. Among the over 537,000 genomes queried that were deposited on GISAID.org prior to 16 February 2021, we detected 1,175 potential recombinant sequences. Using highly conservative criteria to exclude sequences that may have originated through mutation, we find that at least 30 per cent (n = 358) are likely of recombinant origin. An analysis of deep-sequencing data for these putative recombinants, where available, indicated that the majority are high quality. Additional phylogenetic analysis and the observed co-circulation of predicted parent clades in the geographic regions of exposure further support the feasibility of recombination in this subset of potential recombinants. An analysis of these genomes did not reveal evidence for recombination hotspots in the SARS-CoV-2 genome. While most of the putative recombinant sequences we detected were genetic singletons, a small number of genetically identical or highly similar recombinant sequences were identified in the same geographic region, indicative of locally circulating lineages. Recombinant genomes were also found to have originated from parental lineages with substitutions of concern, including D614G, N501Y, E484K, and L452R. Adjusting for an unequal probability of detecting recombinants derived from different parent clades and for geographic variation in clade abundance, we estimate that at most 0.2-2.5 per cent of circulating viruses in the USA and UK are recombinant. Our identification of a small number of putative recombinants within the first year of SARS-CoV-2 circulation underscores the need to sustain efforts to monitor the emergence of new genotypes generated through recombination.
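The screening idea described in the abstract (flag genomes that carry multiple mutational markers from distinct clades) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the marker table `CLADE_MARKERS`, the clade labels `A` and `B`, the genome positions, and the `min_markers` threshold are all hypothetical values chosen for illustration, and a real screen would still need the follow-up steps the abstract mentions (excluding sequences explainable by mutation, checking sequencing quality).

```python
from collections import Counter

# Hypothetical clade-defining markers, NOT real SARS-CoV-2 positions:
# (genome position, alternate base) -> clade label.
CLADE_MARKERS = {
    (100, "T"): "A", (200, "G"): "A", (300, "C"): "A",
    (150, "A"): "B", (250, "T"): "B", (350, "G"): "B",
}


def clade_marker_counts(mutations):
    """Count how many clade-defining markers of each clade a genome carries."""
    counts = Counter()
    for mutation in mutations:
        clade = CLADE_MARKERS.get(mutation)
        if clade is not None:
            counts[clade] += 1
    return counts


def is_potential_recombinant(mutations, min_markers=2):
    """Flag a genome carrying at least `min_markers` markers from each of
    two or more distinct clades (the threshold is an assumed value)."""
    counts = clade_marker_counts(mutations)
    clades_hit = [clade for clade, n in counts.items() if n >= min_markers]
    return len(clades_hit) >= 2


# Toy genomes, each represented as the set of mutations relative to a reference.
genomes = {
    "g1": {(100, "T"), (200, "G"), (300, "C")},               # clade A only
    "g2": {(100, "T"), (200, "G"), (250, "T"), (350, "G")},   # A and B markers
    "g3": {(100, "T"), (150, "A")},                           # one marker each
}

flagged = [name for name, muts in genomes.items() if is_potential_recombinant(muts)]
print(flagged)
```

Requiring several markers per clade, rather than one, is a design choice in this sketch: a single shared marker is more easily explained by recurrent mutation or a sequencing error than by recombination.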