Importance sampling for the infinite sites model.

Authors

Hobolth Asger, Uyenoyama Marcy K, Wiuf Carsten

Affiliation

Aarhus University.

Publication

Stat Appl Genet Mol Biol. 2008;7(1):Article32. doi: 10.2202/1544-6115.1400. Epub 2008 Oct 30.

DOI:10.2202/1544-6115.1400
PMID:18976228
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2832804/
Abstract

Importance sampling or Markov chain Monte Carlo sampling is required for state-of-the-art statistical analysis of population genetics data. The applicability of these sampling-based inference techniques depends crucially on the proposal distribution. In this paper, we discuss importance sampling for the infinite sites model. The infinite sites assumption is attractive because it constrains the number of possible genealogies, thereby allowing for the analysis of larger data sets. We recall the Griffiths-Tavaré and Stephens-Donnelly proposals and emphasize the relation between the latter proposal and exact sampling from the infinite alleles model. We also introduce a new proposal that takes knowledge of the ancestral state into account. The new proposal is derived from a new result on exact sampling from a single site. The methods are illustrated on simulated data sets and the data considered in Griffiths and Tavaré (1994).
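
As a generic illustration of why the proposal distribution matters in importance sampling, the following minimal sketch estimates a small tail probability of a standard normal variable under two different proposals. It is a toy example only: it does not implement the Griffiths-Tavaré or Stephens-Donnelly proposals or the infinite sites model, and the function names, the threshold, and the shifted-normal proposal are all assumptions made for illustration.

```python
import math
import random


def exact_tail_prob(threshold):
    """Exact P(X > threshold) for X ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


def tail_prob_importance_sampling(threshold, shift, n, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by sampling from N(shift, 1).

    Hypothetical helper for illustration. With shift = 0 the proposal equals
    the target, so every weight is 1 and this reduces to plain Monte Carlo.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(shift, 1.0)  # draw from the proposal q = N(shift, 1)
        if x > threshold:
            # Importance weight p(x) / q(x) for unit-variance normals:
            # exp(-x^2 / 2) / exp(-(x - shift)^2 / 2) = exp(-shift * x + shift^2 / 2)
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n


if __name__ == "__main__":
    threshold = 4.0
    exact = exact_tail_prob(threshold)
    naive = tail_prob_importance_sampling(threshold, shift=0.0, n=20000, seed=1)
    tilted = tail_prob_importance_sampling(threshold, shift=4.0, n=20000, seed=1)
    print(f"exact            : {exact:.3e}")
    print(f"proposal N(0, 1) : {naive:.3e}")
    print(f"proposal N(4, 1) : {tilted:.3e}")
```

With the same number of draws, the proposal centred on the rare region yields an estimate close to the exact value (about 3.17e-5), whereas sampling from the target itself almost never reaches the region of interest. The same consideration, on a far larger state space of genealogies, is what motivates the search for good proposals in the setting the abstract describes.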


Similar articles

1
Importance sampling for the infinite sites model.
Stat Appl Genet Mol Biol. 2008;7(1):Article32. doi: 10.2202/1544-6115.1400. Epub 2008 Oct 30.
2
Importance sampling for Lambda-coalescents in the infinitely many sites model.
Theor Popul Biol. 2011 Jun;79(4):155-73. doi: 10.1016/j.tpb.2011.01.005. Epub 2011 Feb 4.
3
Exact computation of coalescent likelihood for panmictic and subdivided populations under the infinite sites model.
IEEE/ACM Trans Comput Biol Bioinform. 2010 Oct-Dec;7(4):611-8. doi: 10.1109/TCBB.2010.2.
4
Efficient simulation and likelihood methods for non-neutral multi-allele models.
J Comput Biol. 2012 Jun;19(6):650-61. doi: 10.1089/cmb.2012.0033.
5
Markov chain Monte Carlo sampling of gene genealogies conditional on unphased SNP genotype data.
Stat Appl Genet Mol Biol. 2013 Oct 1;12(5):559-81. doi: 10.1515/sagmb-2012-0011.
6
Exact test of Hardy-Weinberg equilibrium by Markov chain Monte Carlo.
Math Med Biol. 2003 Dec;20(4):327-40. doi: 10.1093/imammb/20.4.327.
7
Bayesian coestimation of phylogeny and sequence alignment.
BMC Bioinformatics. 2005 Apr 1;6:83. doi: 10.1186/1471-2105-6-83.
8
Convergence time to the Ewens sampling formula in the infinite alleles Moran model.
J Math Biol. 2010 Feb;60(2):189-206. doi: 10.1007/s00285-009-0255-x. Epub 2009 Mar 15.
9
Estimating genealogies from linked marker data: a Bayesian approach.
BMC Bioinformatics. 2007 Oct 25;8:411. doi: 10.1186/1471-2105-8-411.
10
Postprocessing of genealogical trees.
Genetics. 2007 Sep;177(1):347-58. doi: 10.1534/genetics.107.071910. Epub 2007 Jun 11.

Cited by

1
Allele frequency spectra in structured populations: Novel-allele probabilities under the labelled coalescent.
Theor Popul Biol. 2020 Jun;133:130-140. doi: 10.1016/j.tpb.2020.01.002. Epub 2020 Mar 3.
2
Bayesian Estimation of Population Size Changes by Sampling Tajima's Trees.
Genetics. 2019 Nov;213(3):967-986. doi: 10.1534/genetics.119.302373. Epub 2019 Sep 11.
3
Two-Locus Likelihoods Under Variable Population Size and Fine-Scale Recombination Rate Estimation.
Genetics. 2016 Jul;203(3):1381-99. doi: 10.1534/genetics.115.184820. Epub 2016 May 10.
4
Coalescent: an open-science framework for importance sampling in coalescent theory.
PeerJ. 2015 Aug 18;3:e1203. doi: 10.7717/peerj.1203. eCollection 2015.
5
An analytical framework in the general coalescent tree setting for analyzing polymorphisms created by two mutations.
J Math Biol. 2015 Mar;70(4):913-56. doi: 10.1007/s00285-014-0785-8. Epub 2014 Apr 24.
6
Coalescent: an open-source and scalable framework for exact calculations in coalescent theory.
BMC Bioinformatics. 2012 Oct 3;13:257. doi: 10.1186/1471-2105-13-257.
7
Stopping-time resampling and population genetic inference under coalescent models.
Stat Appl Genet Mol Biol. 2012 Jan 6;11(1):Article 9. doi: 10.2202/1544-6115.1770.
8
Importance sampling for Lambda-coalescents in the infinitely many sites model.
Theor Popul Biol. 2011 Jun;79(4):155-73. doi: 10.1016/j.tpb.2011.01.005. Epub 2011 Feb 4.
9
Topologies of the conditional ancestral trees and full-likelihood-based inference in the general coalescent tree framework.
Genetics. 2010 Aug;185(4):1355-68. doi: 10.1534/genetics.109.112847. Epub 2010 May 17.
10
Site frequency spectra from genomic SNP surveys.
Theor Popul Biol. 2009 Jun;75(4):346-54. doi: 10.1016/j.tpb.2009.04.003. Epub 2009 Apr 14.

References

1
Counting all possible ancestral configurations of sample sequences in population genetics.
IEEE/ACM Trans Comput Biol Bioinform. 2006 Jul-Sep;3(3):239-51. doi: 10.1109/TCBB.2006.31.
2
Ewens' sampling formula and related formulae: combinatorial proofs, extensions to variable population size and applications to ages of alleles.
Theor Popul Biol. 2005 Nov;68(3):167-77. doi: 10.1016/j.tpb.2005.02.004.
3
Generating samples under a Wright-Fisher neutral model of genetic variation.
Bioinformatics. 2002 Feb;18(2):337-8. doi: 10.1093/bioinformatics/18.2.337.
4
Times on trees, and the age of an allele.
Theor Popul Biol. 2000 Mar;57(2):109-19. doi: 10.1006/tpbi.1999.1442.
5
Conditional genealogies and the age of a neutral mutant.
Theor Popul Biol. 1999 Oct;56(2):183-201. doi: 10.1006/tpbi.1998.1411.
6
Estimating effective population size and mutation rate from sequence data using Metropolis-Hastings sampling.
Genetics. 1995 Aug;140(4):1421-30. doi: 10.1093/genetics/140.4.1421.
7
Statistical properties of segregating sites.
Theor Popul Biol. 1995 Oct;48(2):172-97. doi: 10.1006/tpbi.1995.1025.
8
The sampling theory of selectively neutral alleles.
Theor Popul Biol. 1972 Mar;3(1):87-112. doi: 10.1016/0040-5809(72)90035-4.
9
Addendum to a paper of W. Ewens.
Theor Popul Biol. 1972 Mar;3(1):113-6. doi: 10.1016/0040-5809(72)90036-6.
10
Statistical properties of the number of recombination events in the history of a sample of DNA sequences.
Genetics. 1985 Sep;111(1):147-64. doi: 10.1093/genetics/111.1.147.