General triallelic frequency spectrum under demographic models with variable population size.

Authors

Jenkins Paul A, Mueller Jonas W, Song Yun S

Affiliation

Department of Statistics, University of Warwick, Coventry CV4 7AL, United Kingdom.

Publication information

Genetics. 2014 Jan;196(1):295-311. doi: 10.1534/genetics.113.158584. Epub 2013 Nov 8.

Abstract

It is becoming routine to obtain data sets on DNA sequence variation across several thousands of chromosomes, providing unprecedented opportunity to infer the underlying biological and demographic forces. Such data make it vital to study summary statistics that offer enough compression to be tractable, while preserving a great deal of information. One well-studied summary is the site frequency spectrum: the empirical distribution, across segregating sites, of the sample frequency of the derived allele. However, most previous theoretical work has assumed that each site has experienced at most one mutation event in its genealogical history, which becomes less tenable for very large sample sizes. In this work we obtain, in closed form, the predicted frequency spectrum of a site that has experienced at most two mutation events, under very general assumptions about the distribution of branch lengths in the underlying coalescent tree. Among other applications, we obtain the frequency spectrum of a triallelic site in a model of historically varying population size. We demonstrate the utility of our formulas in two settings: First, we show that triallelic sites are more sensitive to the parameters of a population that has experienced historical growth, suggesting that they will have use if they can be incorporated into demographic inference. Second, we investigate a recently proposed alternative mechanism of mutation in which the two derived alleles of a triallelic site are created simultaneously within a single individual, and we develop a test to determine whether it is responsible for the excess of triallelic sites in the human genome.
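
To make the summary statistic described in the abstract concrete, the following is a minimal sketch of how an empirical site frequency spectrum could be tallied from a small sample. It is not the authors' method or code, and it does not implement the closed-form formulas of the paper. The function names, the 0/1/2 allele encoding (0 = ancestral, 1 and 2 = two distinct derived alleles), and the rule that a site counts as triallelic only when all three alleles are present in the sample are assumptions made for illustration.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Biallelic SFS from rows of 0 (ancestral) / 1 (derived) alleles.

    Returns a list `sfs` where sfs[i - 1] is the number of segregating
    sites at which the derived allele appears in i of the n chromosomes.
    (Hypothetical helper; the encoding is an assumption.)
    """
    n = len(haplotypes)
    sfs = [0] * (n - 1)
    for column in zip(*haplotypes):  # one column per site
        derived = sum(1 for allele in column if allele == 1)
        if 0 < derived < n:  # skip monomorphic sites
            sfs[derived - 1] += 1
    return sfs


def triallelic_spectrum(haplotypes):
    """Empirical triallelic spectrum from rows of 0 / 1 / 2 alleles.

    Returns a Counter keyed by (a, b): the sample counts of derived
    alleles 1 and 2. A site is kept only if the ancestral allele and
    both derived alleles are all observed (an illustrative assumption).
    """
    n = len(haplotypes)
    spectrum = Counter()
    for column in zip(*haplotypes):
        a = column.count(1)
        b = column.count(2)
        if a > 0 and b > 0 and a + b < n:
            spectrum[(a, b)] += 1
    return spectrum


# Toy data: 4 chromosomes (rows), a few sites (columns).
biallelic = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
]
print(site_frequency_spectrum(biallelic))  # [2, 0, 1]

triallelic = [
    [1, 0],
    [2, 1],
    [2, 1],
    [0, 1],
]
print(triallelic_spectrum(triallelic))  # Counter({(1, 2): 1})
```

In the toy triallelic data, only the first site carries all three alleles, so the second site would contribute to the ordinary biallelic spectrum instead.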
