Zhao Zhongming, Boerwinkle Eric
Human Genetics Center and Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, Texas 77030, USA.
Genome Res. 2002 Nov;12(11):1679-86. doi: 10.1101/gr.287302.
We investigated substitution patterns and neighboring-nucleotide effects for 2,576,903 single nucleotide polymorphisms (SNPs) publicly available through the National Center for Biotechnology Information (NCBI). The proportions of substitutions were A/G, 32.77%; C/T, 32.81%; A/C, 8.98%; G/T, 9.06%; A/T, 7.46%; and C/G, 8.92%. The two nucleotides immediately neighboring the variable site showed major deviation from genome-wide and chromosome-specific expectations, although lesser biases extended as far as 200 bp. On the 5' side, the biases for A, C, G, and T were 1.43%, 4.91%, -1.70%, and -4.62%, respectively. These biases were -4.44%, -1.59%, 5.05%, and 0.99%, respectively, on the 3' side. The neighboring-nucleotide patterns for transitions were dominated by the hypermutability effects of CpG dinucleotides. Transitions were more common than transversions, and the probability of a transversion increased with increasing A + T content at the two adjacent sites. Neighboring-nucleotide biases were not consistent among chromosomes, with Chromosomes 19 and 22 standing out as different from the others. These data provide genome-wide information about the effects of neighboring nucleotides on mutational and evolutionary processes giving rise to contemporary patterns of nucleotide occurrence surrounding SNPs.
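The statement that transitions outnumber transversions can be checked directly from the six substitution proportions reported above. The following is a minimal sketch of that arithmetic, not the authors' analysis code; the names `proportions`, `TRANSITIONS`, and `ts_tv_summary` are hypothetical and introduced only for illustration.

```python
# Substitution proportions (percent) copied from the abstract.
proportions = {
    "A/G": 32.77,
    "C/T": 32.81,
    "A/C": 8.98,
    "G/T": 9.06,
    "A/T": 7.46,
    "C/G": 8.92,
}

# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T);
# the four remaining classes are transversions.
TRANSITIONS = {"A/G", "C/T"}


def ts_tv_summary(props):
    """Return (transition %, transversion %, transition/transversion ratio)."""
    ts = sum(v for k, v in props.items() if k in TRANSITIONS)
    tv = sum(v for k, v in props.items() if k not in TRANSITIONS)
    return ts, tv, ts / tv


ts, tv, ratio = ts_tv_summary(proportions)
print(f"transitions: {ts:.2f}%  transversions: {tv:.2f}%  ratio: {ratio:.2f}")
# prints: transitions: 65.58%  transversions: 34.42%  ratio: 1.91
```

With these figures, transitions account for roughly two-thirds of the SNPs even though only two of the six substitution classes are transitions.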