Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu Sichuan 610065, China.
Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu Sichuan 610065, China.
Zool Res. 2018 Jul 18;39(4):291-300. doi: 10.24272/j.issn.2095-8137.2018.047. Epub 2018 Apr 11.
The Tibetan macaque, which is endemic to China, is currently listed as a Near Threatened primate species by the International Union for Conservation of Nature (IUCN). Short tandem repeats (STRs) are repetitive genomic elements whose repeat units are 1-6 bp long. They are found in many organisms and are widely applied in population genetic studies. To clarify the distribution characteristics of genome-wide STRs and understand their variation among Tibetan macaques, we conducted a genome-wide survey of STRs using next-generation sequencing of five Tibetan macaque samples. A total of 1 077 790 perfect STRs were mined from our assembly, which had an N50 of 4 966 bp. Mono-nucleotide repeats were the most abundant, followed by tetra- and di-nucleotide repeats. The GC content and repeat composition were consistent with those reported for other macaques. Furthermore, using STR analysis software (lobSTR), we found that the proportion of base pair deletions in the STRs was greater than that of insertions in the five Tibetan macaque individuals (P<0.05, t-test). We also found a greater number of homozygous STRs than heterozygous STRs (P<0.05, t-test), with the Emei and Jianyang Tibetan macaques showing more heterozygous loci than the Huangshan Tibetan macaques. The proportion of insertions and the mean allele variation in the Emei and Jianyang individuals were slightly higher than those in the Huangshan individuals, revealing differences in STR allele size between the two populations. The polymorphic STR loci identified based on the reference genome showed good amplification efficiency and could be used to study population genetics in Tibetan macaques. The neighbor-joining tree grouped the five macaques into two branches according to their geographical origin, indicating high genetic differentiation between the Huangshan and Sichuan populations.
We elucidated the distribution characteristics of STRs in the Tibetan macaque genome and provided an effective method for screening polymorphic STRs. Our results also lay a foundation for future genetic variation studies of macaques.
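To illustrate what mining "perfect" STRs involves, the following is a minimal Python sketch, not the pipeline used in this study. It scans a sequence for uninterrupted tandem repeats with repeat units of 1-6 bp. The function names and the minimum copy-number thresholds in `MIN_REPEATS` are hypothetical, chosen only for illustration; the abstract does not state the thresholds actually applied.

```python
# Illustrative thresholds (assumption): minimum number of tandem copies
# required for each repeat-unit length, 1-6 bp.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    return (motif + motif).find(motif, 1) == len(motif)


def find_perfect_strs(seq, min_repeats=MIN_REPEATS):
    """Return a list of (start, motif, copies) for perfect STRs in seq.

    A perfect STR here is an uninterrupted run of identical copies of a
    primitive motif. Coordinates are 0-based.
    """
    seq = seq.upper()
    hits = []
    i = 0
    while i < len(seq):
        found = None
        # Try the shortest repeat unit first.
        for k, min_rep in sorted(min_repeats.items()):
            motif = seq[i:i + k]
            if len(motif) < k or set(motif) - set("ACGT"):
                continue  # ran off the end, or motif contains N/ambiguous bases
            if not is_primitive(motif):
                continue  # e.g. 'AA' is covered by the mono-nucleotide scan
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == motif:
                copies += 1
            if copies >= min_rep:
                found = (i, motif, copies)
                break
        if found:
            hits.append(found)
            i += len(found[1]) * found[2]  # jump past the repeat tract
        else:
            i += 1
    return hits


if __name__ == "__main__":
    demo = "GGC" + "A" * 12 + "CT" + "AC" * 6 + "GC" + "AGAT" * 3 + "C"
    for start, motif, copies in find_perfect_strs(demo):
        print(start, motif, copies)
```

Tallying the hits by `len(motif)` gives the kind of per-class counts (mono-, di-, tetra-nucleotide, and so on) summarized above. Genotyping STR alleles from sequencing reads, as done with lobSTR in this study, is a separate step that this sketch does not attempt.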