Patil Shruti S, Catanese Helen N, Brayton Kelly A, Lofgren Eric T, Gebremedhin Assefaw H
School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164, USA.
Department of Veterinary Microbiology and Pathology, Washington State University, Pullman, WA 99164, USA.
Viruses. 2022 Jul 29;14(8):1672. doi: 10.3390/v14081672.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which still infects hundreds of thousands of people globally each day despite various countermeasures, has been mutating rapidly. Mutations in the spike (S) protein appear to play a vital role in viral stability, transmission, and adaptability. To control the spread of the virus, it is therefore important to understand how the S protein evolves and how its mutant forms spread. This study examines the temporal and geographical distribution of mutant S proteins in sequences gathered across the US over a period of 19 months in 2020 and 2021. The S protein sequences are studied using two approaches: (i) multiple sequence alignment is used to identify prominent mutations and highly mutable regions, and (ii) sequence similarity networks are then used to study the mutation profiles of concerning variants across the defined time periods and states. Additionally, we tracked the variants using visualizations on geographical maps. The visualizations produced using the Directed Weighted All Nearest Neighbors (DiWANN) networks and maps provided insights into the transmission of the virus that closely reflect the statistics reported for the time periods studied. We found that the networks created using DiWANN are superior to the commonly used approximate distance networks created using BLAST bitscores. The study offers a richer computational approach for analyzing the transmission profile of prominent S protein mutations in SARS-CoV-2 and can be extended to other proteins and viruses.
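To make the network idea more concrete, the following is a minimal Python sketch of an "all nearest neighbors" directed weighted network, as the DiWANN name suggests: each sequence gets an edge to every other sequence tied for its smallest distance, weighted by that distance. This is an illustration under stated assumptions, not the authors' implementation. The use of plain Levenshtein edit distance, the brute-force pairwise comparison, the function names, and the toy strings are all assumptions made here for illustration.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def all_nearest_neighbor_edges(seqs: Dict[str, str]) -> List[Tuple[str, str, int]]:
    """Return directed edges (source, target, distance).

    Each sequence points to every other sequence tied for its minimum
    distance. Brute force over all pairs; assumed here for clarity only.
    """
    edges: List[Tuple[str, str, int]] = []
    names = sorted(seqs)
    for src in names:
        dists = {dst: edit_distance(seqs[src], seqs[dst])
                 for dst in names if dst != src}
        if not dists:
            continue  # a single sequence has no neighbors
        best = min(dists.values())
        edges.extend((src, dst, d) for dst, d in dists.items() if d == best)
    return edges


# Toy, hypothetical strings for illustration only (not data from the study).
toy = {
    "ref": "MFVFLVLLPLVSSQ",
    "v1": "MFVFLVLLPLVSSR",  # one substitution relative to ref
    "v2": "MFVFLVLLPLVSR",   # one deletion relative to v1
    "v3": "MFVFFVLLPLVSSQ",  # one substitution relative to ref
}
edges = all_nearest_neighbor_edges(toy)
for src, dst, d in edges:
    print(f"{src} -> {dst} (distance {d})")
```

Because every node keeps only its closest neighbors, the resulting graph is sparse, and ties are preserved rather than broken arbitrarily. A threshold-based network built from approximate scores such as BLAST bitscores would instead connect every pair above a chosen cutoff.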