Tomso Daniel J, Bell Douglas A
Laboratory of Computational Biology and Risk Analysis, National Institute of Environmental Health Sciences, C3-03 P.O. Box 12233, Research Triangle Park, NC 27709, USA.
J Mol Biol. 2003 Mar 21;327(2):303-8. doi: 10.1016/s0022-2836(03)00120-7.
Human polymorphisms originate as mutations, and the influence of context on mutagenesis should be reflected in the distribution of sequences surrounding single nucleotide polymorphisms (SNPs). We have performed a computational survey of nearly two million human SNPs to determine if sequence-dependent hotspots for polymorphism exist in the human genome. Here we show that sequences containing CpG dinucleotides, which occur at low frequencies in the human genome, are 6.7-fold more abundant at polymorphic sites than expected. In contrast, polymorphisms in CpG sequences located within CpG islands, important regulatory regions that modulate gene expression, are 6.8-fold less prevalent than expected. The distribution of polymorphic alleles at CpGs in CpG islands is also significantly different from that in non-island regions. These data strongly support a role for 5-methylcytosine deamination in the generation of human variation, and suggest that variation at CpGs in islands is suppressed.
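The observed-versus-expected comparison behind a figure such as "6.7-fold more abundant" can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the input layout (5' flanking base, alleles, 3' flanking base), and the choice of expected frequency (the fraction of background positions lying inside a CpG dinucleotide) are all assumptions made for illustration, and the data are toy values.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout: (5' flanking base, alleles, 3' flanking base).
Snp = Tuple[str, Iterable[str], str]


def in_cpg(five_prime: str, alleles: Iterable[str], three_prime: str) -> bool:
    """Return True if any allele forms a CpG with a flanking base."""
    allele_set = {a.upper() for a in alleles}
    c_then_g = "C" in allele_set and three_prime.upper() == "G"
    g_after_c = "G" in allele_set and five_prime.upper() == "C"
    return c_then_g or g_after_c


def background_cpg_fraction(seq: str) -> float:
    """Fraction of positions in seq that lie inside a CpG dinucleotide."""
    seq = seq.upper()
    if not seq:
        raise ValueError("background sequence is empty")
    covered = [False] * len(seq)
    for i in range(len(seq) - 1):
        if seq[i] == "C" and seq[i + 1] == "G":
            covered[i] = covered[i + 1] = True
    return sum(covered) / len(seq)


def fold_enrichment(snps: List[Snp], background: str) -> float:
    """Observed CpG fraction at SNP sites divided by the background fraction."""
    if not snps:
        raise ValueError("no SNPs supplied")
    expected = background_cpg_fraction(background)
    if expected == 0:
        raise ValueError("background contains no CpG dinucleotides")
    observed = sum(in_cpg(*snp) for snp in snps) / len(snps)
    return observed / expected


# Toy data (invented for illustration, not taken from the paper).
toy_background = "ACGTACGTAAAATTTTAAAA"
toy_snps = [
    ("A", ("C", "T"), "G"),  # C allele followed by G: CpG site
    ("A", ("A", "G"), "T"),  # G allele, but 5' base is not C
    ("C", ("G", "A"), "T"),  # G allele preceded by C: CpG site
    ("T", ("A", "T"), "A"),  # no C or G allele
]
print(fold_enrichment(toy_snps, toy_background))
```

A ratio above 1 indicates that CpG contexts are over-represented at polymorphic sites relative to the background. Applying the same function to SNPs and background sequence restricted to CpG islands would give the corresponding island-specific ratio, under the same assumptions.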