Misawa Kazuharu, Ootsuki Ryo
Department of Human Genetics, Yokohama City University Graduate School of Medicine, 3-9 Fukuura, Kanazawa-ku, Yokohama 236-0004, Japan.
RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan.
NAR Genom Bioinform. 2024 Feb 2;6(1):lqae009. doi: 10.1093/nargab/lqae009. eCollection 2024 Mar.
SARS-CoV-2 is the cause of the current worldwide pandemic of severe acute respiratory syndrome. Changes in the nucleotide composition of the SARS-CoV-2 genome are crucial for understanding the spread and transmission dynamics of the virus, because viral nucleotide sequences are essential for identifying viral strains. Recent studies have shown that cytosine (C) to uracil (U) substitutions are overrepresented in SARS-CoV-2 genome sequences. This asymmetry between C-to-U and U-to-C substitutions indicates that traditional time-reversible substitution models cannot be applied to the evolution of SARS-CoV-2 sequences. We therefore developed a new time-irreversible model of nucleotide substitution to estimate substitution rates in SARS-CoV-2 genomes. We counted nucleotide substitutions among 7862 SARS-CoV-2 genomic sequences, sampled from all over the world and registered in the Global Initiative on Sharing All Influenza Data (GISAID). Using the new method, the C-to-U substitution rate of SARS-CoV-2 was estimated to be 1.95 × 10⁻³ ± 4.88 × 10⁻⁴ per site per year, compared with 1.48 × 10⁻⁴ ± 7.42 × 10⁻⁵ per site per year for all other types of substitutions.
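The abstract's argument that an excess of C-to-U substitutions rules out time-reversible models can be illustrated with a small sketch. It is not the authors' model or estimation method. It builds a toy rate table over the four RNA bases in which only the C-to-U rate differs from the others, then applies Kolmogorov's criterion: a process is time-reversible only if, around every cycle of states, the product of forward rates equals the product of backward rates. The function names and the relative rates (10.0 and 1.0) are hypothetical, chosen only for illustration.

```python
from itertools import permutations

BASES = "ACGU"

def rate_matrix(c_to_u, other):
    """Toy off-diagonal substitution rates: every substitution has rate
    `other`, except C->U, which has rate `c_to_u` (illustrative only)."""
    rates = {(i, j): other for i in BASES for j in BASES if i != j}
    rates[("C", "U")] = c_to_u
    return rates

def violated_cycles(rates, rel_tol=1e-9):
    """Return the ordered 3-cycles (i -> j -> k -> i) whose forward and
    backward rate products differ, i.e. that break Kolmogorov's criterion."""
    bad = []
    for i, j, k in permutations(BASES, 3):
        forward = rates[i, j] * rates[j, k] * rates[k, i]
        backward = rates[i, k] * rates[k, j] * rates[j, i]
        if abs(forward - backward) > rel_tol * max(forward, backward):
            bad.append((i, j, k))
    return bad

# Hypothetical relative rates, not the estimates reported in the abstract.
symmetric = rate_matrix(c_to_u=1.0, other=1.0)
asymmetric = rate_matrix(c_to_u=10.0, other=1.0)

print("violations with equal rates:   ", len(violated_cycles(symmetric)))
print("violations with elevated C->U: ", len(violated_cycles(asymmetric)))
```

With equal rates no cycle is violated. Once the C-to-U rate is raised without a matching change in U-to-C, every cycle that passes through both C and U fails the criterion, so no stationary distribution can satisfy detailed balance and a time-irreversible model is needed.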