Efficient inference of population size histories and locus-specific mutation rates from large-sample genomic variation data.

Authors

Anand Bhaskar, Y. X. Rachel Wang, Yun S. Song

Affiliations

Simons Institute for the Theory of Computing, Berkeley, California 94720, USA; Computer Science Division, University of California, Berkeley, California 94720, USA;

Department of Statistics, University of California, Berkeley, California 94720, USA;

Publication information

Genome Res. 2015 Feb;25(2):268-79. doi: 10.1101/gr.178756.114. Epub 2015 Jan 6.

DOI: 10.1101/gr.178756.114
PMID: 25564017
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4315300/

Abstract

With the recent increase in study sample sizes in human genetics, there has been growing interest in inferring historical population demography from genomic variation data. Here, we present an efficient inference method that can scale up to very large samples, with tens or hundreds of thousands of individuals. Specifically, by utilizing analytic results on the expected frequency spectrum under the coalescent and by leveraging the technique of automatic differentiation, which allows us to compute gradients exactly, we develop a very efficient algorithm to infer piecewise-exponential models of the historical effective population size from the distribution of sample allele frequencies. Our method is orders of magnitude faster than previous demographic inference methods based on the frequency spectrum. In addition to inferring demography, our method can also accurately estimate locus-specific mutation rates. We perform extensive validation of our method on simulated data and show that it can accurately infer multiple recent epochs of rapid exponential growth, a signal that is difficult to pick up with small sample sizes. Lastly, we use our method to analyze data from recent sequencing studies, including a large-sample exome-sequencing data set of tens of thousands of individuals assayed at a few hundred genic regions.
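
The role of automatic differentiation in frequency-spectrum inference can be illustrated with a deliberately simplified sketch. It is not the authors' implementation and does not fit a piecewise-exponential model. It assumes a constant population size, for which the expected number of sites with i derived copies is theta / i, and it assumes independent Poisson counts per frequency class. The names `Dual`, `dlog`, `sfs_loglik`, and `grad_loglik` and the counts in `sfs` are hypothetical and chosen only for illustration. Forward-mode differentiation with dual numbers returns the gradient of the log-likelihood exactly, up to floating-point rounding, with no finite-difference step.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Forward-mode AD number: a value and its derivative w.r.t. one input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

    def __truediv__(self, other):
        other = _lift(other)
        return Dual(self.val / other.val,
                    (self.dot * other.val - self.val * other.dot) / other.val ** 2)


def _lift(x):
    """Wrap a plain number as a constant (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def dlog(x):
    """Natural log of a Dual, propagating the derivative by the chain rule."""
    return Dual(math.log(x.val), x.dot / x.val)


def sfs_loglik(theta, sfs):
    """Poisson log-likelihood of counts sfs[i-1] with mean theta / i.

    Assumption: constant population size, so E[count_i] = theta / i.
    """
    total = Dual(0.0)
    for i, count in enumerate(sfs, start=1):
        mean = theta / i
        total = total + count * dlog(mean) - mean - math.lgamma(count + 1)
    return total


def grad_loglik(theta_value, sfs):
    """Return (log-likelihood, d log-likelihood / d theta) at theta_value."""
    out = sfs_loglik(Dual(theta_value, 1.0), sfs)  # seed derivative = 1
    return out.val, out.dot


# Made-up frequency spectrum for a sample of n = 10 (classes i = 1..9).
sfs = [40, 22, 12, 9, 7, 5, 4, 3, 3]
a_n = sum(1.0 / i for i in range(1, len(sfs) + 1))
theta_hat = sum(sfs) / a_n  # Watterson's estimator

loglik, grad = grad_loglik(theta_hat, sfs)
print(f"theta_hat = {theta_hat:.4f}, gradient at theta_hat = {grad:.2e}")
```

Under these assumptions the gradient has the closed form S / theta - a_n, where S is the total number of segregating sites, so it vanishes at Watterson's estimator S / a_n. That gives a simple check on the autodiff result. A gradient-based optimizer for a richer demographic model would use the same mechanism with more parameters.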
