Metzgar D, Bytof J, Wills C
Department of Biology, University of California at San Diego, La Jolla, California 92093-0116 USA.
Genome Res. 2000 Jan;10(1):72-80.
Microsatellite enrichment is an excess of repetitive sequences characteristic of all studied eukaryotes. It is thought to result from the accumulated effects of replication slippage mutations. Enrichment is commonly measured as the ratio of the observed frequency of microsatellites to the frequency expected to result from random association of nucleotides. We have compared the enrichment of specific types of microsatellites in coding sequences with that in noncoding sequences across seven eukaryotic clades. The results reveal consistent differences between coding and noncoding regions, in terms of both the quantity of repetitive DNA and the types present. In noncoding regions, all types of microsatellite (mono-, di-, tri-, tetra-, penta-, and hexanucleotide repeats) are found in excess, and in all cases, these excesses scale in a similar exponential fashion with the length of the microsatellite. This suggests that all types of noncoding repeats are subject to similar mutational and selective processes. Coding repeats, however, appear to be under much stronger and more specific constraints. Tri- and hexanucleotide repeats are found in consistent and significant excess over a wide range of lengths in both coding and noncoding sequences, but other repeat types are much less frequent in coding regions than in noncoding regions. These findings suggest that the differences between coding and noncoding microsatellite frequencies arise from specific selection against frameshift mutations in coding regions resulting from length changes in nontriplet repeats. Furthermore, the excesses of tri- and hexanucleotide coding repeats appear to be controlled primarily by mutation pressure.
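The enrichment ratio described in the abstract (observed microsatellite frequency divided by the frequency expected from random association of nucleotides) can be illustrated with a minimal Python sketch. It is not the authors' procedure, which the abstract does not specify. It assumes that the expected count comes from the sequence's own base frequencies with independent positions, that a microsatellite is an exact tandem run of a motif, and that overlapping matches are counted. All function names are hypothetical. The last helper encodes the frameshift argument: only length changes in repeats whose motif length is a multiple of three preserve the reading frame.

```python
from collections import Counter


def count_occurrences(seq, pattern):
    """Count start positions (overlaps allowed) where pattern occurs in seq."""
    count = 0
    start = seq.find(pattern)
    while start != -1:
        count += 1
        start = seq.find(pattern, start + 1)
    return count


def expected_occurrences(seq, pattern):
    """Expected count of pattern if nucleotides associated at random.

    Assumption: positions are independent and each base is drawn with the
    frequency observed in seq itself.
    """
    n = len(seq)
    freqs = {base: c / n for base, c in Counter(seq).items()}
    prob = 1.0
    for base in pattern:
        prob *= freqs.get(base, 0.0)
    positions = max(n - len(pattern) + 1, 0)
    return positions * prob


def enrichment(seq, motif, repeats):
    """Observed / expected count of `motif` repeated `repeats` times in tandem."""
    pattern = motif * repeats
    expected = expected_occurrences(seq, pattern)
    if expected == 0:
        raise ValueError("expected count is zero; the ratio is undefined")
    return count_occurrences(seq, pattern) / expected


def preserves_reading_frame(motif):
    """True if gaining or losing one repeat unit keeps a coding frame intact."""
    return len(motif) % 3 == 0
```

For example, `enrichment(seq, "CAG", 5)` would estimate the excess of runs of five CAG units in `seq`, and `preserves_reading_frame("CAG")` is true whereas `preserves_reading_frame("AC")` is false, mirroring the contrast the abstract draws between triplet and nontriplet repeats in coding regions.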