Shringarpure Suyash, Xing Eric P
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15215, USA.
Genetics. 2009 Jun;182(2):575-93. doi: 10.1534/genetics.108.100222. Epub 2009 Apr 10.
Traditional methods for analyzing population structure, such as the Structure program, ignore the effect of mutations between the ancestral and current alleles of genetic markers, which can dramatically affect the accuracy of structure estimates for current populations. Studying these effects can also reveal additional information about population evolution, such as the divergence time and migration history of admixed populations. We propose mStruct, an admixture of population-specific mixtures of inheritance models that addresses structure inference and mutation estimation jointly through a hierarchical Bayesian framework, together with a variational algorithm for inference. We validated our method on synthetic data and used it to analyze the Human Genome Diversity Project-Centre d'Etude du Polymorphisme Humain (HGDP-CEPH) cell line panel of microsatellites and HGDP single-nucleotide polymorphism (SNP) data. We present a comparison of the structural maps of world populations estimated by mStruct and Structure, and we report potentially interesting mutation patterns in world populations estimated by mStruct.
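To make the phrase "admixture of population-specific mixtures of inheritance models" more concrete, the following is a minimal generative sketch, not the mStruct implementation. It assumes a Dirichlet prior on an individual's admixture proportions, a small set of ancestral microsatellite alleles (repeat counts) per population, and a toy stepwise mutation kernel governed by a per-population rate `delta`. All function names, parameters, and the specific mutation kernel are hypothetical choices made for illustration only; the abstract does not specify them.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw admixture proportions from Dirichlet(alpha) via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def mutate(ancestral, delta, rng):
    """Toy stepwise mutation (an assumption, not the paper's kernel).

    With probability 1 - delta the ancestral allele is inherited unchanged.
    Otherwise it shifts by a geometrically distributed number of repeat
    units in a random direction.
    """
    if rng.random() >= delta:
        return ancestral
    steps = 1
    while rng.random() < delta:
        steps += 1
    return ancestral + rng.choice([-1, 1]) * steps


def simulate_individual(alpha, ancestral_alleles, ancestral_freqs, delta, n_loci, rng):
    """Simulate one haploid individual under the sketched admixture model.

    alpha             -- Dirichlet parameters, one per ancestral population
    ancestral_alleles -- ancestral_alleles[k] lists the founder alleles of population k
    ancestral_freqs   -- ancestral_freqs[k] gives the mixture weights of those alleles
    delta             -- delta[k] is the hypothetical mutation rate of population k
    """
    theta = sample_dirichlet(alpha, rng)
    genotype = []
    for _ in range(n_loci):
        # Pick the ancestral population this allele copy descends from.
        k = rng.choices(range(len(theta)), weights=theta)[0]
        # Pick an ancestral allele from that population's mixture.
        founder = rng.choices(ancestral_alleles[k], weights=ancestral_freqs[k])[0]
        # Pass the founder allele through the population-specific mutation model.
        genotype.append(mutate(founder, delta[k], rng))
    return theta, genotype


if __name__ == "__main__":
    rng = random.Random(0)
    theta, genotype = simulate_individual(
        alpha=[1.0, 1.0],
        ancestral_alleles=[[10, 12], [20, 22]],
        ancestral_freqs=[[0.5, 0.5], [0.3, 0.7]],
        delta=[0.05, 0.2],
        n_loci=10,
        rng=rng,
    )
    print(theta, genotype)
```

Setting every `delta[k]` to zero removes the mutation step, so observed alleles always equal founder alleles; that is the sense in which a model ignoring mutation is a special case of this sketch. Inference (recovering `theta`, the founder alleles, and `delta` from observed genotypes) is the hard part and is not shown here.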