Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21211, USA.
Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.
Genetics. 2021 Jul 14;218(3). doi: 10.1093/genetics/iyab074.
The ability to detect recombination in pathogen genomes is crucial to the accuracy of phylogenetic analysis, and consequently to forecasting the spread of infectious diseases and to developing therapeutics and public health policies. However, in the case of SARS-CoV-2, the low divergence among near-identical genomes sequenced over a short period of time makes conventional analysis infeasible. Using a novel method, Bolotie, we identified 225 anomalous SARS-CoV-2 genomes of likely recombinant origin among the first 87,695 genomes to be released, several of which have persisted in the population. Bolotie is specifically designed to perform a rapid search for inter-clade recombination events over extremely large datasets, facilitating analysis of novel isolates in seconds. In cases where raw sequencing data were available, we analyzed the underlying sequence reads and were able to rule out the possibility that these samples represented co-infections. The Bolotie software and other data from our study are available at https://github.com/salzberg-lab/bolotie.