School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China.
Institute of Life Sciences, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China.
Mol Genet Genomics. 2019 Jun;294(3):637-647. doi: 10.1007/s00438-019-01535-1. Epub 2019 Feb 13.
Genomes can be considered a combination of 16 dinucleotides. Analysing the relative abundance of different dinucleotides may reveal important features of genome evolution. In the present study, we conducted extensive surveys of the relative abundances of dinucleotides in various genomic components of 28 bacterial, 20 archaeal, 19 fungal, 24 plant and 29 animal species. We found that TA, GT and AC are significantly under-represented in open reading frames of all organisms and in intergenic regions and introns of most organisms. The usage of specific dinucleotides varies greatly among codon positions. The significantly low representations of TA, GT and AC are considered the evolutionary consequences of preventing the formation of premature stop codons and of reducing intron-splicing options in candidate primary mRNA sequences. These data suggest that a reduction of TA and GT occurred on both strands of the DNA sequence at an early stage of de novo gene birth. Interestingly, GT and AC are also significantly under-represented in current prokaryotic genomes, suggesting that ancient prokaryotic protein-coding genes might have contained introns. The greatly varied usage of specific dinucleotides at different codon positions is considered an evolutionary accommodation that compensates for the unavailability of specific codons and avoids the formation of premature stop codons. This is the first report to use dinucleotide relative abundance data to indicate the possible existence of spliceosomal introns in ancient prokaryotic genes and to hypothesize early steps of de novo gene birth.
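The abstract does not state how relative abundance was calculated. As an illustration only, the sketch below assumes the commonly used odds-ratio measure, in which the observed frequency of a dinucleotide XY is divided by the product of the frequencies of its component nucleotides X and Y. A value well below 1 would indicate under-representation. The function name `dinucleotide_relative_abundance` and the single-strand counting scheme are assumptions made for this example, not the authors' published pipeline.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_relative_abundance(seq):
    """Return {dinucleotide: f_xy / (f_x * f_y)} for all 16 dinucleotides.

    Hypothetical helper: counts overlapping dinucleotides on one strand only
    and skips any pair that contains a non-ACGT character.
    """
    seq = seq.upper()

    # Mononucleotide counts, ignoring ambiguous symbols such as N.
    mono = Counter(base for base in seq if base in BASES)
    n_mono = sum(mono.values())

    # Overlapping dinucleotide counts.
    di = Counter(
        a + b for a, b in zip(seq, seq[1:]) if a in BASES and b in BASES
    )
    n_di = sum(di.values())

    rho = {}
    for a, b in product(BASES, repeat=2):
        # Expected frequency if the two bases were independent.
        expected = (mono[a] / n_mono) * (mono[b] / n_mono) if n_mono else 0.0
        if n_di and expected:
            rho[a + b] = (di[a + b] / n_di) / expected
        else:
            # Undefined when a component base is absent from the sequence.
            rho[a + b] = float("nan")
    return rho


if __name__ == "__main__":
    example = dinucleotide_relative_abundance("ACGTACGT")
    for dinucleotide in ("TA", "GT", "AC"):
        print(dinucleotide, round(example[dinucleotide], 3))
```

For a survey like the one described, such a function would be applied separately to each genomic component (open reading frames, introns, intergenic regions), and to each codon position when dinucleotide usage within codons is of interest.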