Laboratory of Neurogenetics, National Institute on Alcohol Abuse and Alcoholism, NIH, Bethesda, MD 20892, USA.
BMC Genet. 2012 Jun 29;13:52. doi: 10.1186/1471-2156-13-52.
As a model organism in biomedicine, the rhesus macaque (Macaca mulatta) is the most widely used nonhuman primate. Although a draft genome sequence was completed in 2007, there has been no systematic genome-wide comparison of genetic variation in this species with that in humans. Comparative analysis of functional and nonfunctional diversity in this highly abundant and adaptable nonhuman primate could inform its use as a model for human biology, and could reveal how variation in population history and size alters patterns and levels of sequence variation in primates.
We sequenced the mRNA transcriptome and H3K4me3-marked DNA regions in hippocampus from 14 humans and 14 rhesus macaques. Using equivalent methodology and sampling spaces, we identified 462,802 macaque SNPs, most of which were novel and disproportionately located in the functionally important genomic regions we had targeted in the sequencing. At least one SNP was identified in each of 16,797 annotated macaque genes. Accuracy of macaque SNP identification was conservatively estimated to be >90%. Comparative analyses using SNPs equivalently identified in the two species revealed that the rhesus macaque has approximately three times higher SNP density and average nucleotide diversity than the human. Based on this level of diversity, the effective population size of the rhesus macaque is approximately 80,000, in contrast to an effective population size of less than 10,000 for humans. Across five categories of genomic regions, intergenic regions had the highest SNP density and average nucleotide diversity and coding sequences (CDS) the lowest, in both humans and macaques. Although there are more coding SNPs (cSNPs) per individual in macaques than in humans, the dN/dS ratio is significantly lower in the macaque. Furthermore, the number of damaging nonsynonymous cSNPs (those predicted by PolyPhen-2 to have damaging effects on protein function) in the macaque is more closely equivalent to that of the human.
This large panel of newly identified macaque SNPs enriched for functionally significant regions considerably expands our knowledge of genetic variation in the rhesus macaque. Comparative analysis reveals that this widespread, highly adaptable species is approximately three times as diverse as the human but more closely equivalent in damaging variation.
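The step from nucleotide diversity to effective population size can be illustrated with a short sketch. The abstract does not say which estimator or mutation rate the authors used, so the code below assumes the standard neutral-theory relation for a diploid population, pi = 4 * Ne * mu, and the usual unbiased estimator of per-site diversity from allele frequencies. The function names, the allele frequencies, the number of callable sites, and the mutation rate are all hypothetical placeholders and are not taken from the study.

```python
def nucleotide_diversity(alt_allele_freqs, n_chromosomes, n_sites):
    """Average pairwise differences per site (pi).

    alt_allele_freqs: alternate-allele frequency at each SNP (hypothetical input)
    n_chromosomes:    number of sampled chromosomes (2 x diploid individuals)
    n_sites:          total callable sites, variable or not
    """
    # Small-sample correction n / (n - 1) for an unbiased estimate
    correction = n_chromosomes / (n_chromosomes - 1)
    # Expected heterozygosity summed over SNPs, then averaged over all sites
    total_heterozygosity = sum(2 * p * (1 - p) for p in alt_allele_freqs)
    return correction * total_heterozygosity / n_sites


def effective_population_size(pi, mu):
    """Solve pi = 4 * Ne * mu for Ne (diploid, neutral, equilibrium assumption)."""
    return pi / (4 * mu)


# Illustrative values only; none of these numbers come from the paper.
example_freqs = [0.1, 0.25, 0.5, 0.05]   # four hypothetical SNPs
pi_example = nucleotide_diversity(example_freqs, n_chromosomes=28, n_sites=1000)
ne_example = effective_population_size(pi_example, mu=1e-8)  # assumed per-site, per-generation rate
print(f"pi = {pi_example:.6f}, Ne = {ne_example:.0f}")
```

Under this relation, Ne scales linearly with pi and inversely with the assumed mutation rate, so a cross-species comparison of Ne depends on the mutation rate chosen for each species as well as on the measured diversity.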