Inference of population structure using genetic markers and a Bayesian model averaging approach for clustering.

Author information

Santafé Guzmán, Lozano Jose A, Larrañaga Pedro

Affiliations

Computer Science and Artificial Intelligence Department, University of the Basque Country, San Sebastian, Spain.

Publication information

J Comput Biol. 2008 Mar;15(2):207-20. doi: 10.1089/cmb.2007.0051.

DOI:10.1089/cmb.2007.0051

PMID:18312151

Abstract

The analysis of the structure of populations on the basis of genetic data is essential in population genetics. It is used, for instance, to study the evolution of species or to correct for population stratification in association studies. These genetic data, normally based on DNA polymorphisms, may contain irrelevant information that biases the inference of population structure. In this paper we adapt a recently proposed algorithm, named multi-start EMA, to the inference of population structure. This algorithm is able to deal with irrelevant information when obtaining the (probabilistic) population partition. Additionally, we present a marker selection test that identifies the markers most relevant to retrieving that population partition. The proposed algorithm is compared with the widely used STRUCTURE software on the basis of the F(ST) metric and the log-likelihood score. It is shown that the proposed algorithm improves the inference of the population structure. Moreover, information about relevant markers obtained by the multi-start EMA can be used to improve the results obtained by other methods, to correct for population stratification, or even to reduce the economic cost of sequencing new samples. The software presented in this paper is available online at http://www.sc.ehu.es/ccwbayes/members/guzman.
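The abstract names F(ST) as a comparison metric but does not say which estimator the authors used. As a rough illustration only, the sketch below computes the textbook heterozygosity-based form, F(ST) = (H_T - H_S) / H_T, for a single biallelic marker from per-subpopulation allele frequencies. The function name `fst_biallelic` and its parameters are hypothetical and are not taken from the paper or from the STRUCTURE software.

```python
from typing import Optional, Sequence


def fst_biallelic(freqs: Sequence[float],
                  sizes: Optional[Sequence[float]] = None) -> float:
    """Heterozygosity-based F_ST for one biallelic marker.

    freqs: frequency of one allele in each subpopulation.
    sizes: optional relative subpopulation sizes (equal weights if omitted).
    Assumption: this is the textbook (H_T - H_S) / H_T definition, not
    necessarily the estimator used in the paper.
    """
    if not freqs:
        raise ValueError("at least one subpopulation is required")
    if sizes is None:
        sizes = [1.0] * len(freqs)
    if len(sizes) != len(freqs):
        raise ValueError("freqs and sizes must have the same length")

    total = float(sum(sizes))
    weights = [s / total for s in sizes]

    # Mean allele frequency over the pooled population.
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Weighted mean expected heterozygosity within subpopulations.
    h_within = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))

    if h_total == 0.0:
        # Marker is monomorphic overall, so it carries no structure signal.
        return 0.0
    return (h_total - h_within) / h_total
```

Under this definition, a marker with identical frequencies in every subpopulation scores 0 and a marker fixed for different alleles in two equally sized subpopulations scores 1, which is why higher F(ST) values for an inferred partition are read as better-separated clusters.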
