APPROXIMATE SAMPLING FORMULAS FOR GENERAL FINITE-ALLELES MODELS OF MUTATION.

Authors

Bhaskar Anand, Kamm John A, Song Yun S

Affiliation

University of California, Berkeley.

Publication

Adv Appl Probab. 2012 Jun;44(2):408-428. doi: 10.1239/aap/1339878718.

Abstract

Many applications in genetic analyses utilize sampling distributions, which describe the probability of observing a sample of DNA sequences randomly drawn from a population. In the one-locus case with special models of mutation such as the infinite-alleles model or the finite-alleles parent-independent mutation model, closed-form sampling distributions under the coalescent have been known for many decades. However, no exact formula is currently known for more general models of mutation that are of biological interest. In this paper, models with finitely-many alleles are considered, and an urn construction related to the coalescent is used to derive approximate closed-form sampling formulas for an arbitrary irreducible recurrent mutation model or for a reversible recurrent mutation model, depending on whether the number of distinct observed allele types is at most three or four, respectively. It is demonstrated empirically that the formulas derived here are highly accurate when the per-base mutation rate is low, which holds for many biological organisms.
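
For orientation, the long-known closed form that the abstract mentions for the finite-alleles parent-independent mutation (PIM) model can be sketched as follows. This is not the paper's approximate formula for general mutation models. It is the classical result that the stationary sample is Dirichlet-multinomial, written here under an assumed scaling in which `thetas[i]` is the population-scaled rate of mutation to allele i. The function names `log_rising` and `pim_sampling_prob` are hypothetical and chosen only for this illustration.

```python
from math import exp, lgamma


def log_rising(x, n):
    """Log of the rising factorial x (x + 1) ... (x + n - 1), for x > 0."""
    return lgamma(x + n) - lgamma(x)


def pim_sampling_prob(counts, thetas, ordered=False):
    """Stationary probability of allele counts under the PIM model.

    counts[i] is the number of sampled sequences carrying allele i.
    thetas[i] > 0 is the scaled rate of mutation to allele i (an assumed
    parameterization; scaling conventions differ between references).
    With ordered=True the result is the probability of one particular
    ordered sample with these counts; otherwise the multinomial
    coefficient is included to give the unordered configuration.
    """
    n = sum(counts)
    theta = sum(thetas)
    # Product of per-allele rising factorials over the total rising factorial.
    logp = sum(log_rising(t, c) for t, c in zip(thetas, counts))
    logp -= log_rising(theta, n)
    if not ordered:
        # Multinomial coefficient n! / (n_1! ... n_K!).
        logp += lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)
    return exp(logp)


if __name__ == "__main__":
    # Two alleles, sample of size 5, small illustrative mutation rates.
    for k in range(6):
        print((k, 5 - k), pim_sampling_prob((k, 5 - k), (0.01, 0.02)))
```

Working in log space with `lgamma` keeps the calculation stable for large samples. For a mutation matrix that is not parent-independent, no such product form is available, which is the gap the approximate formulas in the paper address.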
