State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan & School of Life Sciences, Yunnan University, Kunming, 650091, China.
College of Engineering, Honghe University, Mengzi, 661100, China.
Theor Biol Med Model. 2020 Apr 8;17(1):3. doi: 10.1186/s12976-020-00122-x.
CpG dinucleotides, the major methylation sites in vertebrate genomes, mutate at a high rate from their methylated form to TpG/CpA and therefore influence the evolution of genome composition. However, the quantitative effects of these CpG to TpG/CpA mutations on genome composition, measured as dinucleotide frequencies/proportions, remain poorly understood.
Based on the neutral theory of molecular evolution, we propose a methylation-driven model (MDM) that predicts the changes in the frequencies/proportions of the 16 dinucleotides and in the GC content of a genome, given a known number of CpG to TpG/CpA mutations. Applied to 10 published vertebrate genomes, the MDM yields predicted trends of change, relative to the assumed initial values, that agree well with the observed trends for most of the 16 dinucleotides and for the GC content. The model performs better on the mammalian genomes than on the lower-vertebrate genomes. Its performance depends on the compositional characteristics of the genome, the assumed initial state of the genome, and the estimated parameters. One or more of these factors account for the difference in performance between the mammalian and lower-vertebrate genomes and for the large deviations between the predicted and observed frequencies of a few dinucleotides.
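To make the bookkeeping behind such predictions concrete, the sketch below counts overlapping dinucleotides and GC content in a toy sequence and shows how a single CpG to TpG or CpG to CpA substitution shifts those counts. It is an illustration only and not the MDM itself: the function names (`dinucleotide_counts`, `gc_content`, `mutate_cpg`, `count_changes`) and the toy sequence are hypothetical, and the model's parameters and its estimation procedure are not reproduced here.

```python
from collections import Counter


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides on one strand of a DNA string."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def mutate_cpg(seq, index, to="TG"):
    """Replace the CpG starting at `index` with TpG ("TG") or CpA ("CA").

    Hypothetical helper for illustration; it applies one substitution
    and makes no claim about mutation rates.
    """
    seq = seq.upper()
    if seq[index:index + 2] != "CG":
        raise ValueError("no CpG at the given index")
    if to not in ("TG", "CA"):
        raise ValueError("target must be 'TG' or 'CA'")
    return seq[:index] + to + seq[index + 2:]


def count_changes(before, after):
    """Return the non-zero change in count for each dinucleotide."""
    b = dinucleotide_counts(before)
    a = dinucleotide_counts(after)
    return {d: a[d] - b[d] for d in set(a) | set(b) if a[d] != b[d]}


# Toy example (assumed sequence, not data from the paper).
original = "ACGT"
tpg = mutate_cpg(original, 1, to="TG")  # "ATGT"
cpa = mutate_cpg(original, 1, to="CA")  # "ACAT"
print(count_changes(original, tpg), gc_content(tpg))
print(count_changes(original, cpa), gc_content(cpa))
```

The point of the sketch is that one substitution changes more than the CpG count. The neighbouring dinucleotide that overlaps the mutated base changes as well, and the GC content drops, so a model of this kind has to track all 16 dinucleotides together.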
Despite certain limitations of the current model, its successful application to the higher-vertebrate (mammalian) genomes demonstrates its potential to facilitate studies of the role of methylation in driving the evolution of genome dinucleotide composition.