Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Zhejiang University, 866 Yuhangtang Road, Hangzhou, 310058, China.
Technical Centre for Animal Plant and Food Inspection and Quarantine, Shanghai Entry-exit Inspection and Quarantine Bureau, Shanghai, 200135, China.
BMC Genomics. 2017 Nov 6;18(1):848. doi: 10.1186/s12864-017-4234-0.
Simple sequence repeats (SSRs), also called microsatellites, are widely used as genetic markers and have been studied extensively in some model insects. Genomes are now available for more than 100 insect species, yet the features of SSRs in most insect genomes remain largely unknown.
We identified 15.01 million SSRs across 136 insect genomes. The number of identified SSRs was positively associated with genome size, but SSR frequency and density per megabase were not. Most insect SSRs (56.2-93.1%) were perfect (no mismatch). Imperfect SSRs (at least one mismatch; average length 22-73 bp) were longer than perfect SSRs (16-30 bp). The most abundant insect SSRs were the di- and trinucleotide types, which accounted for 27.2% and 22.0% of all SSRs, respectively. On average, 59.1%, 36.8%, and 3.7% of insect SSRs were located in intergenic, intronic, and exonic regions, respectively. The percentages of the various SSR types were similar among insects from the same family but dissimilar among insects from different families within the same order. A phylogenetic analysis based on SSR frequencies showed the same pattern: species from the same family generally clustered together in the tree, whereas insects from the same order but different families did not. These results indicate that although SSRs undergo rapid expansions and contractions in different populations of the same species, the general genomic features of insect SSRs remain conserved at the family level.
Millions of insect SSRs were identified and their genomic features were analyzed. Most insect SSRs were perfect and were located in intergenic regions. We present evidence that variation in insect SSRs accumulated after the divergence of insect families.
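The quantities reported above (perfect SSRs, motif types, frequency and density per megabase) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the abstract does not name the detection tool or its thresholds, so the minimum repeat counts in `MIN_REPEATS`, the function names, and the definitions of frequency (SSR loci per Mb) and density (SSR base pairs per Mb) are all assumptions made for illustration. The sketch finds only perfect SSRs; imperfect SSRs, which tolerate mismatches, would need a different approach.

```python
import re
from collections import Counter

# Hypothetical thresholds: minimum number of repeat units per motif length.
# The abstract does not state the values used in the study.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ACAC"),
            # so that each tract is reported under its shortest motif only.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)


def ssr_stats(hits, genome_length):
    """Summarise hits; 'frequency' and 'density' follow the assumed definitions."""
    total_bp = sum(end - start for start, end, _, _ in hits)
    return {
        "count": len(hits),
        "frequency_per_mb": len(hits) * 1e6 / genome_length,  # loci per Mb
        "density_bp_per_mb": total_bp * 1e6 / genome_length,  # SSR bp per Mb
        "by_motif_length": Counter(len(motif) for _, _, motif, _ in hits),
    }


# Toy sequence: one dinucleotide and one trinucleotide SSR with short flanks.
toy = "GGT" + "AC" * 8 + "TTC" + "ATG" * 5 + "C"
toy_hits = find_perfect_ssrs(toy)
toy_stats = ssr_stats(toy_hits, genome_length=1_000_000)
```

Normalising by genome length is what separates the raw SSR count, which tends to grow with genome size, from frequency and density, which need not.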