Efficient computation of the joint sample frequency spectra for multiple populations.

Authors

Kamm John A, Terhorst Jonathan, Song Yun S

Affiliations

Department of Statistics, University of California, Berkeley.

Departments of EECS, Statistics, and Integrative Biology, University of California, Berkeley.

Publication information

J Comput Graph Stat. 2017;26(1):182-194. doi: 10.1080/10618600.2016.1159212. Epub 2017 Feb 16.

DOI: 10.1080/10618600.2016.1159212
PMID: 28239248
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5319604/
Abstract

A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity.
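
To make the summary statistic concrete, the following is a minimal sketch of how an observed joint SFS could be tabulated from data. It is not the paper's algorithm and not the momi API: the function name `joint_sfs`, the 0/1 derived-allele matrix layout, and the `pop_of_sample` labels are all assumptions made for illustration. The paper addresses the harder problem of computing the *expected* joint SFS under a demographic model, which inference methods then compare against a table like this one.

```python
import numpy as np


def joint_sfs(derived, pop_of_sample, n_pops):
    """Tabulate an observed joint SFS (hypothetical helper, not momi's API).

    derived       : (n_sites, n_samples) array of 0/1, where 1 marks the
                    mutant (derived) allele in that sampled sequence.
    pop_of_sample : length n_samples array of population indices in [0, n_pops).
    Returns an array of shape (n_0 + 1, ..., n_{P-1} + 1) whose entry
    [k_0, ..., k_{P-1}] counts sites with k_p derived alleles in population p.
    """
    derived = np.asarray(derived)
    pop_of_sample = np.asarray(pop_of_sample)

    # Number of sampled sequences in each population.
    sizes = np.bincount(pop_of_sample, minlength=n_pops)

    # counts[s, p] = derived-allele count at site s within population p.
    counts = np.stack(
        [derived[:, pop_of_sample == p].sum(axis=1) for p in range(n_pops)],
        axis=1,
    )

    sfs = np.zeros(tuple(sizes + 1), dtype=np.int64)
    # np.add.at accumulates correctly when several sites share one entry.
    np.add.at(sfs, tuple(counts.T), 1)
    return sfs


# Toy data: 4 sites, 2 sequences from population 0 and 2 from population 1.
derived = [
    [1, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]
sfs = joint_sfs(derived, pop_of_sample=[0, 0, 1, 1], n_pops=2)
print(sfs)
```

The table has one axis per population, so its size grows as the product of the (sample size + 1) terms. In this sketch the two corner entries (no derived alleles anywhere, or derived alleles in every sequence) correspond to sites that are not polymorphic in the sample, and would typically be excluded before analysis.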


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fcf0/5319604/961eeb7d01fa/nihms777150f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fcf0/5319604/1cda1d322213/nihms777150f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fcf0/5319604/c83ec4989599/nihms777150f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fcf0/5319604/d266daffbfe2/nihms777150f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fcf0/5319604/b96a4409ff8d/nihms777150f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fcf0/5319604/27a0630d6231/nihms777150f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fcf0/5319604/d6e7650fa1ec/nihms777150f7.jpg
