Robust and scalable inference of population history from hundreds of unphased whole genomes.

Authors

Terhorst Jonathan, Kamm John A, Song Yun S

Affiliations

Department of Statistics, University of California, Berkeley, Berkeley, California, USA.

Computer Science Division, University of California, Berkeley, Berkeley, California, USA.

Publication information

Nat Genet. 2017 Feb;49(2):303-309. doi: 10.1038/ng.3748. Epub 2016 Dec 26.

DOI: 10.1038/ng.3748
PMID: 28024154
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5470542/

Abstract

It has recently been demonstrated that inference methods based on genealogical processes with recombination can uncover past population history in unprecedented detail. However, these methods scale poorly with sample size, limiting resolution in the recent past, and they require phased genomes, which contain switch errors that can catastrophically distort the inferred history. Here we present SMC++, a new statistical tool capable of analyzing orders of magnitude more samples than existing methods while requiring only unphased genomes (its results are independent of phasing). SMC++ can jointly infer population size histories and split times in diverged populations, and it employs a novel spline regularization scheme that greatly reduces estimation error. We apply SMC++ to analyze sequence data from over a thousand human genomes in Africa and Eurasia, hundreds of genomes from a Drosophila melanogaster population in Africa, and tens of genomes from zebra finch and long-tailed finch populations in Australia.

Similar articles

1
Robust and scalable inference of population history from hundreds of unphased whole genomes.
Nat Genet. 2017 Feb;49(2):303-309. doi: 10.1038/ng.3748. Epub 2016 Dec 26.
2
Robust inference of population size histories from genomic sequencing data.
PLoS Comput Biol. 2022 Sep 16;18(9):e1010419. doi: 10.1371/journal.pcbi.1010419. eCollection 2022 Sep.
3
Modeling Human Population Separation History Using Physically Phased Genomes.
Genetics. 2017 Jan;205(1):385-395. doi: 10.1534/genetics.116.192963. Epub 2016 Nov 9.
4
Estimating variable effective population sizes from multiple genomes: a sequentially Markov conditional sampling distribution approach.
Genetics. 2013 Jul;194(3):647-62. doi: 10.1534/genetics.112.149096. Epub 2013 Apr 22.
5
Inferring whole-genome histories in large population datasets.
Nat Genet. 2019 Sep;51(9):1330-1338. doi: 10.1038/s41588-019-0483-y. Epub 2019 Sep 2.
6
High rates of phasing errors in highly polymorphic species with low levels of linkage disequilibrium.
Mol Ecol Resour. 2016 Jul;16(4):874-82. doi: 10.1111/1755-0998.12516. Epub 2016 Mar 21.
7
Biases in ARG-Based Inference of Historical Population Size in Populations Experiencing Selection.
Mol Biol Evol. 2024 Jul 3;41(7). doi: 10.1093/molbev/msae118.
8
The inference of sex-biased human demography from whole-genome data.
PLoS Genet. 2019 Sep 20;15(9):e1008293. doi: 10.1371/journal.pgen.1008293. eCollection 2019 Sep.
9
Deep Learning for Population Genetic Inference.
PLoS Comput Biol. 2016 Mar 28;12(3):e1004845. doi: 10.1371/journal.pcbi.1004845. eCollection 2016 Mar.
10
Distinguishing among complex evolutionary models using unphased whole-genome data through random forest approximate Bayesian computation.
Mol Ecol Resour. 2021 Nov;21(8):2614-2628. doi: 10.1111/1755-0998.13263. Epub 2020 Oct 25.

Cited by

1
Robust and accurate Bayesian inference of genome-wide genealogies for hundreds of genomes.
Nat Genet. 2025 Sep 8. doi: 10.1038/s41588-025-02317-9.
2
Inter-Island Whole-Genome Comparison Reveals Micro-Evolutionary Dynamics of the Red Fox, Stimulated Through Post-Glacial Sea-Level Alterations.
Genome Biol Evol. 2025 Jul 30;17(8). doi: 10.1093/gbe/evaf152.
3
GHIST 2024: The 1st Genomic History Inference Strategies Tournament.
bioRxiv. 2025 Aug 11:2025.08.05.668560. doi: 10.1101/2025.08.05.668560.
4
Telomere-to-telomere genome assembly uncovers Wolbachia-driven recurrent male bottleneck effect and selection in a sawfly.
Commun Biol. 2025 Aug 13;8(1):1211. doi: 10.1038/s42003-025-08629-0.
5
Local Adaptation Shapes Phenotypic and Genetic Diversity in .
Genes (Basel). 2025 Jun 23;16(7):729. doi: 10.3390/genes16070729.
6
Assessing the Population Demographic History of the Tsushima Leopard Cat and Its Genetic Divergence Time from Continental Populations.
Biology (Basel). 2025 Jul 18;14(7):880. doi: 10.3390/biology14070880.
7
The Vulpes vulpes montana genome provides insights into high-altitude adaptation mechanisms of the Vulpes species.
Commun Biol. 2025 Jul 18;8(1):1070. doi: 10.1038/s42003-025-08450-9.
8
Signatures of soft selective sweeps predominate in the yellow fever mosquito .
bioRxiv. 2025 Jul 10:2025.07.06.663360. doi: 10.1101/2025.07.06.663360.
9
On the use of generative models for evolutionary inference of malaria vectors from genomic data.
bioRxiv. 2025 Jun 27:2025.06.26.661760. doi: 10.1101/2025.06.26.661760.
10
Coalescence and Translation: A Language Model for Population Genetics.
bioRxiv. 2025 Jun 27:2025.06.24.661337. doi: 10.1101/2025.06.24.661337.
