Inference of population structure using genetic markers and a Bayesian model averaging approach for clustering.

Authors

Santafé Guzmán, Lozano Jose A, Larrañaga Pedro

Affiliation

Computer Science and Artificial Intelligence Department, University of the Basque Country, San Sebastian, Spain.

Publication

J Comput Biol. 2008 Mar;15(2):207-20. doi: 10.1089/cmb.2007.0051.

DOI: 10.1089/cmb.2007.0051
PMID: 18312151

Abstract

The analysis of population structure from genetic data is essential in population genetics. It is used, for instance, to study the evolution of species or to correct for population stratification in association studies. These genetic data, normally based on DNA polymorphisms, may contain irrelevant information that biases the inference of population structure. In this paper we adapt a recently proposed algorithm, named multi-start EMA, to the inference of population structure. This algorithm can deal with irrelevant information when obtaining the (probabilistic) population partition. Additionally, we present a marker selection test that identifies the markers most relevant to retrieving that population partition. The proposed algorithm is compared with the widely used STRUCTURE software on the basis of the F(ST) metric and the log-likelihood score. The results show that the proposed algorithm improves the recovery of the population structure. Moreover, the information about relevant markers obtained by the multi-start EMA can be used to improve the results of other methods, to correct for population stratification, or even to reduce the economic cost of sequencing new samples. The software presented in this paper is available online at http://www.sc.ehu.es/ccwbayes/members/guzman.
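
The abstract names F(ST) as one of the two comparison metrics but does not say which estimator the authors used. As a rough illustration only, the sketch below computes the textbook definition F_ST = (H_T - H_S) / H_T for biallelic markers given a hard cluster assignment. The function name `fst_per_marker`, the 0/1/2 genotype encoding, and the weighting of clusters by sample share are all assumptions made for this example, not details taken from the paper or from STRUCTURE.

```python
import numpy as np

def fst_per_marker(genotypes, labels):
    """Textbook F_ST = (H_T - H_S) / H_T for each biallelic marker.

    genotypes: (n_individuals, n_markers) array of alternate-allele
               counts coded 0, 1 or 2 (an assumed encoding).
    labels:    length n_individuals array of cluster assignments.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)

    # Alternate-allele frequency of every marker inside every cluster.
    freqs = np.array([genotypes[labels == c].mean(axis=0) / 2.0
                      for c in clusters])
    # Weight each cluster by its share of the sampled individuals.
    weights = np.array([(labels == c).mean() for c in clusters])

    p_bar = weights @ freqs                        # pooled allele frequency
    h_t = 2.0 * p_bar * (1.0 - p_bar)              # total expected heterozygosity
    h_s = weights @ (2.0 * freqs * (1.0 - freqs))  # mean within-cluster value

    # Markers that are monomorphic overall have H_T = 0; report 0 for them.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(h_t > 0, (h_t - h_s) / h_t, 0.0)
```

Under this definition, a marker whose allele frequencies differ completely between clusters scores 1, while a marker with identical frequencies in every cluster scores 0. The second case is the kind of marker that carries no information about the partition.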

Similar articles

1
Inference of population structure using genetic markers and a Bayesian model averaging approach for clustering.
J Comput Biol. 2008 Mar;15(2):207-20. doi: 10.1089/cmb.2007.0051.
2
Bayesian model averaging of naive Bayes for clustering.
IEEE Trans Syst Man Cybern B Cybern. 2006 Oct;36(5):1149-61. doi: 10.1109/tsmcb.2006.874132.
3
A novel approach for clustering proteomics data using Bayesian fast Fourier transform.
Bioinformatics. 2005 May 15;21(10):2210-24. doi: 10.1093/bioinformatics/bti383. Epub 2005 Mar 15.
4
The effect of close relatives on unsupervised Bayesian clustering algorithms in population genetic structure analysis.
Mol Ecol Resour. 2012 Sep;12(5):873-84. doi: 10.1111/j.1755-0998.2012.03156.x. Epub 2012 May 28.
5
A novel Bayesian semiparametric algorithm for inferring population structure and adjusting for case-control association tests.
Biometrics. 2013 Mar;69(1):164-73. doi: 10.1111/biom.12004. Epub 2013 Feb 21.
6
Spectrum: joint Bayesian inference of population structure and recombination events.
Bioinformatics. 2007 Jul 1;23(13):i479-89. doi: 10.1093/bioinformatics/btm171.
7
An approximate Bayesian computation approach to overcome biases that arise when using amplified fragment length polymorphism markers to study population structure.
Genetics. 2008 Jun;179(2):927-39. doi: 10.1534/genetics.107.084541. Epub 2008 May 27.
8
Comparison of Bayesian and maximum-likelihood inference of population genetic parameters.
Bioinformatics. 2006 Feb 1;22(3):341-5. doi: 10.1093/bioinformatics/bti803. Epub 2005 Nov 29.
9
AMOVA-based clustering of population genetic data.
J Hered. 2012 Sep-Oct;103(5):744-50. doi: 10.1093/jhered/ess047. Epub 2012 Aug 15.
10
PSMIX: an R package for population structure inference via maximum likelihood method.
BMC Bioinformatics. 2006 Jun 22;7:317. doi: 10.1186/1471-2105-7-317.

Cited by

1
Inferring population structure and relationship using minimal independent evolutionary markers in Y-chromosome: a hybrid approach of recursive feature selection for hierarchical clustering.
Nucleic Acids Res. 2014 Sep;42(15):e122. doi: 10.1093/nar/gku585. Epub 2014 Jul 16.
2
The computer program STRUCTURE does not reliably identify the main genetic clusters within species: simulations and implications for human population structure.
Heredity (Edinb). 2011 Apr;106(4):625-32. doi: 10.1038/hdy.2010.95. Epub 2010 Aug 4.