Data Science Institute and School of Computer Science, Faculty of Engineering and IT, University of Technology Sydney, Ultimo, NSW 2007, Australia.
School of Life Sciences, Faculty of Science, University of Technology Sydney, Ultimo, NSW 2007, Australia.
Viruses. 2023 Apr 26;15(5):1065. doi: 10.3390/v15051065.
The COVID-19 pandemic caused by SARS-CoV-2 has had a severe impact on people worldwide. The reference genome of the virus has been widely used as a template for designing mRNA vaccines to combat the disease. In this study, we present a computational method for identifying co-existing intra-host strains of the virus from the short-read RNA-sequencing data that were used to assemble the original reference genome. Our method consisted of five key steps: extraction of relevant reads, error correction of the reads, identification of within-host diversity, phylogenetic study, and protein binding affinity analysis. Our study revealed that multiple strains of SARS-CoV-2 can coexist both in the viral sample used to produce the reference sequence and in a wastewater sample from California. Our workflow was also able to identify within-host diversity in foot-and-mouth disease virus (FMDV). Our analysis sheds light on the binding affinity and phylogenetic relationships of these strains relative to the published SARS-CoV-2 reference genome, SARS-CoV, variants of concern (VOC) of SARS-CoV-2, and some closely related coronaviruses. These insights have important implications for future research aimed at identifying within-host diversity, understanding the evolution and spread of these viruses, and developing effective treatments and vaccines against them.
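To make the "identification of within-host diversity" step more concrete, the following is a minimal sketch of one common way to flag candidate mixed sites: count the bases observed at each position across aligned, error-corrected reads and report positions where more than one base exceeds a frequency threshold. This is an illustration only and is not the authors' method. The function name `minor_variant_sites`, the thresholds `min_freq` and `min_depth`, and the toy reads are all assumptions introduced here, and the sketch assumes the reads are already aligned to the same coordinates.

```python
from collections import Counter


def minor_variant_sites(aligned_reads, min_freq=0.05, min_depth=4):
    """Return {position: {base: frequency}} for positions where more than
    one base is supported at or above min_freq.

    Hypothetical helper for illustration: reads are assumed to be strings
    already aligned to the same start coordinate.
    """
    if not aligned_reads:
        return {}
    length = max(len(read) for read in aligned_reads)
    sites = {}
    for pos in range(length):
        # Collect unambiguous bases covering this position (skip gaps and N).
        column = [r[pos] for r in aligned_reads if pos < len(r) and r[pos] in "ACGT"]
        depth = len(column)
        if depth < min_depth:
            continue
        freqs = {base: n / depth for base, n in Counter(column).items()}
        supported = {base: f for base, f in freqs.items() if f >= min_freq}
        if len(supported) > 1:
            sites[pos] = supported
    return sites


# Toy input (assumed data): position 2 is mixed (G in 8 reads, T in 2);
# position 5 has a single stray A that falls below the 0.15 threshold.
reads = [
    "ACGTAC", "ACGTAC", "ACTTAC", "ACTTAC", "ACGTAC",
    "ACGTAC", "ACGTAC", "ACGTAC", "ACGTAC", "ACGTAA",
]
sites = minor_variant_sites(reads, min_freq=0.15)
```

With these toy reads only position 2 is reported, since the single discordant base at position 5 is treated as likely residual error. In practice the threshold would depend on sequencing depth and on the error rate remaining after correction.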