Murugadoss Karthik, Niesen Michiel J M, Raghunathan Bharathwaj, Lenehan Patrick J, Ghosh Pritha, Feener Tyler, Anand Praveen, Simsek Safak, Suratekar Rohit, Hughes Travis K, Soundararajan Venky
nference, Cambridge, MA 02139, USA.
nference, Toronto, ON M5V 1M1, Canada.
PNAS Nexus. 2022 Mar 10;1(1):pgac018. doi: 10.1093/pnasnexus/pgac018. eCollection 2022 Mar.
Highly transmissible or immuno-evasive SARS-CoV-2 variants have intermittently emerged, resulting in repeated COVID-19 surges. With over 6 million SARS-CoV-2 genomes sequenced, there is unprecedented data to decipher the evolution of fitter SARS-CoV-2 variants. Much attention has been directed to studying the functional importance of specific mutations in the Spike protein, but there is limited knowledge of genomic signatures shared by dominant variants. Here, we introduce a method to quantify the genome-wide distinctiveness of polynucleotide fragments (3- to 240-mers) that constitute SARS-CoV-2 sequences. Compared to standard phylogenetic metrics and mutational load, the new metric provides improved separation between Variants of Concern (VOCs; Reference = 89, IQR: 65-108; Alpha = 166, IQR: 149-181; Beta = 131, IQR: 114-149; Gamma = 164, IQR: 150-178; Delta = 235, IQR: 217-255; and Omicron = 459, IQR: 395-521). Omicron's high genomic distinctiveness may confer an advantage over prior VOCs and the recently emerged and highly mutated B.1.640.2 (IHU) lineage. Evaluation of 883 lineages highlights that genomic distinctiveness has increased over time ( = 0.37) and that VOCs score significantly higher than contemporary non-VOC lineages, with Omicron among the most distinctive lineages observed. This study demonstrates the value of characterizing SARS-CoV-2 variants by genome-wide polynucleotide distinctiveness and emphasizes the need to go beyond a narrow set of mutations at known sites on the Spike protein. The consistently higher distinctiveness of each emerging VOC compared to prior VOCs suggests that monitoring of genomic distinctiveness would facilitate rapid assessment of viral fitness.
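The abstract does not define the distinctiveness metric, so the following is only a minimal sketch of one plausible reading: count the k-mers of a query genome that are absent from every sequence in a background set, summed over a range of k. The function names (`kmers`, `distinct_kmer_count`), the toy sequences, and the scoring rule itself are assumptions made for illustration and should not be taken as the authors' implementation.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (empty if k > len(seq))."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def distinct_kmer_count(query, background, k_values):
    """Hypothetical score: number of query k-mers absent from all background
    sequences, summed over the given k values."""
    total = 0
    for k in k_values:
        # Pool every k-mer seen in any background sequence.
        seen = set()
        for ref in background:
            seen |= kmers(ref, k)
        # Count query k-mers that never occur in the background pool.
        total += len(kmers(query, k) - seen)
    return total


# Toy data only; real genomes are about 30 kb and the paper uses k = 3 to 240.
background = ["ACGTACGTAC", "ACGTTCGTAC"]
query = "ACGTAGGTAC"
score = distinct_kmer_count(query, background, k_values=[3, 4])
print(score)
```

Under this reading, a genome carrying many mutations relative to previously circulating sequences would contribute more novel k-mers and thus score higher, which is at least consistent with the ordering of the reported VOC values.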