Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China.
Insect Sci. 2019 Aug;26(4):607-619. doi: 10.1111/1744-7917.12577. Epub 2018 Apr 6.
Simple sequence repeats (SSRs) occur in both eukaryotic and prokaryotic genomes and are among the most widely used genetic markers, but the SSRs of mosquito genomes are still not well understood. In this study, we identified and analyzed SSRs at the whole-genome level in 23 mosquito species, using Drosophila melanogaster as a reference. The results show that SSR number (33 076-560 175/genome) and genome size (574.57-1342.21 Mb) are significantly positively correlated (R = 0.8992, P < 0.01), although the relationship varies among individual species. Among the six SSR types, mono- to trinucleotide SSRs are dominant, with cumulative percentages of 95.14%-99.00% and densities of 195.65/Mb-787.51/Mb, whereas tetra- to hexanucleotide SSRs are rare, with cumulative percentages of 1.12%-4.22% and densities of 3.76/Mb-40.23/Mb. (A/T)n, (AC/GT)n and (AGC/GCT)n are the most frequent motifs among mononucleotide, dinucleotide and trinucleotide SSRs, respectively, and the motif frequencies of tetra- to hexanucleotide SSRs appear to be species-specific. SSRs of 10-20 bp in length are dominant, averaging 110 561 ± 93 482 in number and 87.25% ± 5.73% in frequency, and both number and frequency decline as length increases. Most SSRs (83.34% ± 7.72%) are located in intergenic regions, followed by intron regions (11.59% ± 5.59%), exon regions (3.74% ± 1.95%), and untranslated regions (1.32% ± 1.39%). Mono-, di- and trinucleotide SSRs are the main SSRs in both gene regions (98.55% ± 0.85%) and exon regions (99.27% ± 0.52%). On average, 42.52% of all genes contain SSRs, and the preference for SSR occurrence in different gene subcategories is species-specific. The study provides useful insights into the diversity, characteristics and distribution of SSRs in the genomes of 23 mosquito species.
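To make the quantities reported above concrete (SSR counts by motif length and density per Mb), the following is a minimal Python sketch of how mono- to hexanucleotide repeats could be located in a sequence and how a per-Mb density could be computed. The abstract does not state which detection tool or thresholds were used, so the regular-expression approach, the `MIN_REPEATS` thresholds, and the function names `find_ssrs` and `ssr_density_per_mb` are all illustrative assumptions rather than the authors' method.

```python
import re

# Hypothetical minimum repeat counts per motif length (1 = mono ... 6 = hexa).
# These thresholds are assumptions for illustration, not values from the study.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, repeat_count) tuples for perfect SSRs in seq.

    Simplified scan: matches for each motif length are non-overlapping, and
    compound or imperfect repeats are not handled.
    """
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ACAC"),
            # so each locus is reported only under its shortest motif.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return hits


def ssr_density_per_mb(n_ssrs, genome_bp):
    """SSRs per megabase: count divided by genome length in Mb."""
    return n_ssrs / (genome_bp / 1_000_000)


if __name__ == "__main__":
    # Toy sequence with one mono-, one di- and one trinucleotide repeat.
    toy = "GG" + "A" * 12 + "GG" + "AC" * 6 + "TT" + "AGC" * 5 + "T"
    for start, end, motif, count in find_ssrs(toy):
        print(f"{start}-{end}\t({motif}){count}")
```

In this sketch a motif and its reverse complement, such as AC and GT, are reported separately; pooling them into a single class like (AC/GT)n, as the abstract does, would need an extra canonicalization step.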